Chapter 1. Introduction

This introductory chapter briefly reviews the major motivations for quantum mechanics. Then its
simplest formalism - Schrödinger’s wave mechanics - is described, and its main features are discussed.
Much of this material (perhaps except for the last section) may be found in undergraduate textbooks. 1


1.1. Experimental motivations 

By the beginning of the 1900s, physics (which by that time included what we now know as
nonrelativistic classical mechanics, classical statistics and thermodynamics, and classical
electrodynamics including geometric and wave optics) looked like an almost completed discipline, with a
lot of experimental observations explained, and just a couple of mysterious “dark clouds” 2 on the
horizon. However, rapid technological progress and the resulting fast development of experimental
techniques led to a fast multiplication of observed phenomena that could not be explained on the
classical basis. Let me list the most consequential of those experimental findings.

(i) Blackbody radiation measurements, started by G. Kirchhoff in 1859, have shown that in
thermal equilibrium, the power of electromagnetic radiation by a fully absorbing (“black”) surface
per unit frequency interval drops exponentially at high frequencies. This is not what could be expected
from the combination of classical electrodynamics and statistics, which predicted an infinite growth
of the radiation density with frequency. Indeed, classical electrodynamics shows 3 that electromagnetic
field modes in free space evolve in time as harmonic oscillators, and that the number $dN$ of these modes in a
large volume $V \gg \lambda^3$ per small frequency interval $d\omega$ is


$$ dN = 2V \frac{d^3k}{(2\pi)^3} = 2V \frac{4\pi k^2\,dk}{(2\pi)^3} = V \frac{\omega^2}{\pi^2 c^3}\,d\omega, \tag{1.1} $$

where $c \approx 3\times10^8$ m/s is the free-space speed of light, $\omega$ its frequency, $k = \omega/c$ the free-space wave
number, and $\lambda = 2\pi/k$ is the radiation wavelength. On the other hand, classical statistics 4 predicts that in
thermal equilibrium at temperature $T$, the average energy $E$ of each 1D harmonic oscillator should
equal $k_BT$, where $k_B$ is the Boltzmann constant. 5


1 For remedial reading, I can recommend the following textbooks (in the alphabetical order): S. Gasiorowicz,
Quantum Physics, 3rd ed., Wiley, 2003; D. Griffiths, Introduction to Quantum Mechanics, 2nd ed., Pearson Prentice
Hall, 2005; and R. Liboff, Introductory Quantum Mechanics, 3rd ed., Addison-Wesley, 1998.

2 This expression was used in a 1900 talk by Lord Kelvin (born W. Thomson) in reference to the blackbody
radiation measurements and the Michelson-Morley experiment results, i.e. the precursors of quantum mechanics
and relativity theory.

3 See, e.g., EM Sec. 7.9. The degeneracy factor 2 in Eq. (1) is due to two possible polarizations of transverse
electromagnetic waves. For waves of other physical nature, which obey the linear (“acoustic”) dispersion
law, similar relations are also valid, though possibly with a different degeneracy factor - see, e.g., CM Sec. 7.7.

4 See, e.g., SM Sec. 2.2. 

5 In the SI units, used throughout these notes, $k_B \approx 1.38\times10^{-23}$ J/K. Note that in many theoretical papers (and in the
SM part of my notes), $k_B$ is taken for 1, i.e. temperature is measured in energy units.

Combining these two results, we readily get the so-called Rayleigh-Jeans formula for the
average electromagnetic wave energy per unit volume:

$$ u \equiv \frac{1}{V}\frac{dE}{d\omega} = \frac{k_BT}{V}\frac{dN}{d\omega} = \frac{\omega^2}{\pi^2 c^3}\,k_BT, \tag{1.2} $$


that diverges at $\omega \to \infty$. On the other hand, the blackbody radiation measurements, improved by O.
Lummer and E. Pringsheim, and also H. Rubens and F. Kurlbaum to reach a 1%-scale accuracy, were
compatible with the phenomenological law suggested in 1900 by Max Planck:


$$ u = \frac{\omega^2}{\pi^2 c^3}\,\frac{\hbar\omega}{\exp\{\hbar\omega/k_BT\} - 1}. \tag{1.3a} $$


The law may be reconciled with the fundamental Eq. (1) if the following replacement is made for the 
average energy of each field oscillator: 


$$ k_BT \to \frac{\hbar\omega}{\exp\{\hbar\omega/k_BT\} - 1}, \tag{1.3b} $$


with a constant factor 


$$ \hbar \approx 1.055\times10^{-34}\ \text{J·s}, \tag{1.4} $$


now called Planck’s constant. 6 At low frequencies ($\hbar\omega \ll k_BT$), the denominator in Eq. (3) may be
approximated as $\hbar\omega/k_BT$, so that the average energy (3b) tends to its classical value $k_BT$, and the Planck
law (3a) reduces to the Rayleigh-Jeans formula (2). However, at higher frequencies ($\hbar\omega \gg k_BT$), Eq. (3)
describes the experimentally observed rapid decrease of the radiation density - see Fig. 1.



Fig. 1.1. Blackbody radiation density $u$, expressed
in units of $u_0 \equiv (k_BT)^3/\pi^2\hbar^2c^3$, as a function of
frequency, according to: the Rayleigh-Jeans
formula (blue line) and the Planck law (red line).
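The comparison plotted in Fig. 1.1 is easy to reproduce numerically. The following Python sketch assumes the dimensionless frequency x = ħω/k_BT and the unit u_0 from the figure caption, in which Eq. (1.2) becomes x² and Eq. (1.3a) becomes x³/(eˣ - 1); the function names are illustrative and not part of the original notes.

```python
import math


def u_rayleigh_jeans(x):
    """Rayleigh-Jeans density, Eq. (1.2), in units of u_0; x = hbar*omega/(k_B*T)."""
    return x ** 2


def u_planck(x):
    """Planck density, Eq. (1.3a), in units of u_0.

    math.expm1(x) evaluates exp(x) - 1 without loss of accuracy at small x.
    """
    return x ** 3 / math.expm1(x)


if __name__ == "__main__":
    for x in (0.01, 0.1, 1.0, 3.0, 10.0):
        print(f"x = {x:5.2f}   RJ: {u_rayleigh_jeans(x):9.4g}   Planck: {u_planck(x):9.4g}")
```

At small x the two columns agree, while at large x the Planck values drop exponentially and the Rayleigh-Jeans values keep growing.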


(ii) The photoelectric effect, experimentally discovered in 1887 by H. Hertz, shows a sharp 
lower bound on the frequency of light that may kick electrons out from metallic surfaces, regardless of 


6 M. Planck himself wrote $\hbar\omega$ as $h\nu$, where $\nu = \omega/2\pi$ is the “cyclic” frequency, measured in Hz (periods per
second), so that in early texts the term “Planck’s constant” referred to $h \equiv 2\pi\hbar$, while $\hbar$ was called “the Dirac
constant” for a while.


the light intensity. Albert Einstein, in the first of his three famous 1905 papers, noticed that this
threshold $\omega_{\min}$ could be readily explained assuming that light consisted of certain particles (now called
photons) with energy

$$ E = \hbar\omega = h\nu, \tag{1.5} $$


with the same Planck’s constant that participates in Eq. (3). 7 Indeed, with this assumption, at the photon
absorption by the surface, its energy $E = \hbar\omega$ is divided between a fixed energy $W$ (now called the
workfunction) of electron binding inside the metal, and the residual kinetic energy $mv^2/2 > 0$ of the freed
electron - see Fig. 2. In this picture, the frequency threshold finds a natural explanation as $\omega_{\min} = W/\hbar$. 8
Moreover, as was shown by S. Bose in 1924, Eq. (5) readily explains 9 Planck’s law (3).



Fig. 1.2. Einstein’s explanation of the photoelectric 
effect’s frequency threshold. 
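As a quick numerical check of the threshold condition, the sketch below evaluates ω_min = W/ħ and the corresponding wavelength 2πc/ω_min for a given workfunction, and the kinetic energy ħω - W of the freed electron. The constants are the rounded values quoted in the text; the function names are illustrative assumptions.

```python
import math

HBAR = 1.055e-34   # J*s, Eq. (1.4)
C = 3.0e8          # m/s, free-space speed of light
EV = 1.602e-19     # J per electron-volt


def threshold_frequency(work_function_ev):
    """omega_min = W / hbar, in rad/s."""
    return work_function_ev * EV / HBAR


def threshold_wavelength(work_function_ev):
    """lambda_max = 2*pi*c / omega_min, in meters."""
    return 2 * math.pi * C / threshold_frequency(work_function_ev)


def kinetic_energy_ev(omega, work_function_ev):
    """Kinetic energy (eV) of the freed electron, or None below the threshold."""
    energy = HBAR * omega / EV - work_function_ev
    return energy if energy > 0 else None


if __name__ == "__main__":
    for w in (4.0, 5.0):
        print(f"W = {w} eV  ->  lambda_max = {threshold_wavelength(w) * 1e9:.0f} nm")
```

For workfunctions between 4 and 5 eV this gives threshold wavelengths between roughly 250 and 310 nm, i.e. near the border between visible and ultraviolet light.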


(iii) The discrete frequency spectra of radiation by excited atomic gases, known since the 1600s,
could not be explained by classical physics. (Applied to the planetary model of atoms, proposed by E.
Rutherford, it predicts the collapse of electrons on nuclei in ~$10^{-10}$ s due to electric dipole radiation of
electromagnetic waves. 10) Especially challenging was the observation by J. Balmer (in 1885) that the
radiation frequencies of simple atoms may be described by simple formulas. For example, for the
simplest atom, hydrogen, all radiation frequencies may be numbered with just two positive integers $n$
and $n'$:




$$ \omega_{n,n'} = \omega_0\left(\frac{1}{n^2} - \frac{1}{n'^2}\right), \tag{1.6} $$


with $\omega_0 \equiv \omega_{1,\infty} \approx 2.07\times10^{16}$ s$^{-1}$. The Balmer series, including the value of $\omega_0$, found its first
explanation in the famous 1913 theory by Niels Bohr, which was a semi-phenomenological precursor
of quantum mechanics. In this theory, $\omega_{n,n'}$ is interpreted as the frequency of a photon that obeys
Einstein’s formula (5), with its energy $E_{n,n'}$ being the difference between two quantized (discrete) energy
levels of the atom (Fig. 3):

$$ E_{n,n'} = E_{n'} - E_n > 0. \tag{1.7} $$


7 As a reminder, A. Einstein received his only Nobel Prize (in 1922) for exactly this work, which essentially 
started quantum mechanics, rather than for his relativity theory. 

8 For most metals, $W$ is between 4 and 5 electron-volts (eV), so that the threshold corresponds to $\lambda_{\max} = 2\pi c/\omega_{\min} =
ch/W \approx 300$ nm - approximately at the border between the visible light and ultraviolet radiation.

9 See, e.g., SM Sec. 2.5. 

10 See, e.g., EM Sec. 8.2.


Fig. 1.3. Electromagnetic wave radiation at 
system’s transition between its two quantized 
energy levels. 


Bohr showed that the correct 11 expression for the levels (relative to the free electron energy),

$$ E_n = -\frac{E_H}{2n^2}, \tag{1.8} $$

and the correct value of the so-called Hartree energy 12

$$ E_H \equiv 2\hbar\omega_0 = \frac{m_e}{\hbar^2}\left(\frac{e^2}{4\pi\varepsilon_0}\right)^2 \approx 27.2\ \text{eV}, \tag{1.9} $$

(where $e \approx 1.602\times10^{-19}$ C is the fundamental electric charge, and $m_e \approx 0.911\times10^{-30}$ kg is electron’s rest
mass) could be obtained, with a virtually one-line calculation, from classical mechanics plus just one
additional postulate, equivalent to the assumption that the angular momentum $L = m_evr$ of the electron
moving on a circular trajectory of radius $r$ about hydrogen’s nucleus (i.e. the proton, assumed to stay at rest),
is quantized as

$$ L = \hbar n, \tag{1.10} $$


where $\hbar$ is again the same Planck’s constant (4), and $n$ is an integer. Indeed, in order to derive Eq. (8), it
is sufficient to solve Eq. (10) together with the 2nd Newton’s law for the rotating electron,

$$ m_e\frac{v^2}{r} = \frac{e^2}{4\pi\varepsilon_0 r^2}, \tag{1.11} $$

for the electron velocity $v$ and radius $r$, and then plug the results into the nonrelativistic expression for
the full electron’s energy

$$ E = \frac{m_ev^2}{2} - \frac{e^2}{4\pi\varepsilon_0 r}. \tag{1.12} $$


(This nonrelativistic approach to the problem is justified a posteriori by the fact that the relevant energy
scale $E_H$ is much smaller than electron’s rest energy, $m_ec^2 \approx 0.5$ MeV.) By the way, the value of $r$
corresponding to $n = 1$, i.e. to the smallest possible electron orbit,

$$ r_B = \frac{\hbar^2}{m_e\,(e^2/4\pi\varepsilon_0)} \approx 0.0529\ \text{nm}, \tag{1.13} $$



11 Besides very small corrections due to the finite ratio of the electron mass $m_e$ to that of the nucleus, and minor
spin-orbit and relativistic effects - see Secs. 6.3 and 9.7 below.

12 Unfortunately, another name, “Rydberg constant”, is also frequently used for either this atomic energy unit or
its half, $E_H/2 \approx 13.6$ eV. To add to the confusion, the same term “Rydberg constant” is sometimes used for the
reciprocal free-space wavelength ($1/\lambda_0 = \omega_0/2\pi c$) corresponding to frequency $\omega_0 = E_H/2\hbar$.

and called the Bohr radius, defines the most important spatial scale of phenomena in atomic, molecular
and condensed matter physics - as well as in chemistry and biochemistry.
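The "one-line calculation" behind Eqs. (1.8)-(1.13) can be checked numerically. The sketch below solves the quantization condition (1.10) together with Newton's law (1.11) for a circular orbit, and then evaluates the energy (1.12). The constants are the rounded values used in the text, except the electric constant ε₀, whose standard value is an addition of mine; all names are illustrative.

```python
import math

HBAR = 1.055e-34      # J*s, Eq. (1.4)
E_CHARGE = 1.602e-19  # C
M_E = 0.911e-30       # kg
EPS0 = 8.854e-12      # F/m, electric constant (standard value, not quoted in the text)

COULOMB = E_CHARGE ** 2 / (4 * math.pi * EPS0)   # e^2 / (4 pi eps0), in J*m


def bohr_orbit(n):
    """Solve Eqs. (1.10) and (1.11) for the n-th circular orbit; return (r, v, E)."""
    r = n ** 2 * HBAR ** 2 / (M_E * COULOMB)   # radius following from both equations
    v = HBAR * n / (M_E * r)                   # Eq. (1.10): m v r = hbar n
    energy = M_E * v ** 2 / 2 - COULOMB / r    # Eq. (1.12)
    return r, v, energy


HARTREE = M_E * COULOMB ** 2 / HBAR ** 2       # Eq. (1.9), in joules
OMEGA_0 = HARTREE / (2 * HBAR)                 # rad/s

if __name__ == "__main__":
    print(f"E_H     = {HARTREE / E_CHARGE:.2f} eV")
    print(f"r_B     = {bohr_orbit(1)[0] * 1e9:.4f} nm")
    print(f"omega_0 = {OMEGA_0:.3e} 1/s")
```

With these inputs, the orbit energies returned by `bohr_orbit(n)` coincide with -E_H/2n², and the n = 1 radius reproduces the Bohr radius.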


Now note that the quantization postulate (10) may be presented as the condition that an integer
number ($n$) of certain waves 13 fits the circular orbit’s perimeter: $2\pi r = n\lambda$. Dividing both parts of this
relation by $\lambda$, and comparing the result, $kr = n$, with Eq. (10) in the form $pr = \hbar n$, we see that for this
statement to be true, the wave number $k \equiv 2\pi/\lambda$ of the (then hypothetical)
de Broglie waves should be proportional to electron’s momentum $p = mv$:

$$ p = \hbar k. \tag{1.14} $$


(iv) The Compton effect 14 is the reduction of frequency of X-rays at their scattering on free (or 
nearly-free) electrons - see Fig. 4. 



The effect may be explained assuming that the X-ray photon also has a momentum that obeys the 
vector-generalized version of Eq. (14): 


$$ \mathbf{p}_{\text{photon}} = \hbar\mathbf{k} = \mathbf{n}\frac{\hbar\omega}{c}, \tag{1.15} $$

where $\mathbf{k}$ is the wavevector (whose magnitude is equal to the wave number $k$, and direction coincides
with that, $\mathbf{n}$, of the wave propagation), and that momenta $\mathbf{p}$ of both the photon and the electron are
related to their energies $E$ by the classical relativistic formula 15

$$ E^2 = (cp)^2 + (mc^2)^2. \tag{1.16} $$

(For a photon, the rest energy is zero, and this relation is reduced to Eq. (5): $E = cp = c\hbar k = \hbar\omega$.) Indeed,
a straightforward solution of the following system of three equations,

$$ \hbar\omega + m_ec^2 = \hbar\omega' + \left[(cp)^2 + (m_ec^2)^2\right]^{1/2}, \tag{1.17} $$

$$ \frac{\hbar\omega}{c} = \frac{\hbar\omega'}{c}\cos\theta + p\cos\varphi, \tag{1.18} $$

$$ 0 = \frac{\hbar\omega'}{c}\sin\theta - p\sin\varphi, \tag{1.19} $$


13 This fact was noticed and discussed in detail in 1923 by L. de Broglie, so that instead of discussing 
wavefunctions, especially of free particles, we are still frequently speaking of de Broglie waves. 

14 This effect was observed (in 1922) and explained a year later by A. Compton. 

15 See, e.g., EM Sec. 9.3.

(which describe, respectively, the conservation of the full energy of the photon-electron system, and of
two relevant Cartesian components of its full momentum, at the scattering event - see Fig. 4), yields the
following result,

$$ \frac{1}{\hbar\omega'} = \frac{1}{\hbar\omega} + \frac{1}{m_ec^2}(1 - \cos\theta), \tag{1.20a} $$

which is traditionally represented as the relation between the initial and final values of photon’s
wavelength $\lambda = 2\pi/k = 2\pi/(\omega/c)$:

$$ \lambda' = \lambda + \frac{2\pi\hbar}{m_ec}(1 - \cos\theta) \equiv \lambda + \lambda_C(1 - \cos\theta), \quad \text{with } \lambda_C \equiv \frac{2\pi\hbar}{m_ec}, \tag{1.20b} $$

and is in agreement with experiment. 16
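The solution (1.20) can be verified numerically by inserting it back into the conservation laws. The sketch below, with illustrative function names and the rounded constants used in the text, computes the scattered wavelength from Eq. (1.20b), and separately takes the photon energy after scattering from Eq. (1.20a), finds the electron momentum from Eqs. (1.18)-(1.19), and returns the relative mismatch in the energy balance (1.17).

```python
import math

HBAR = 1.055e-34   # J*s
M_E = 0.911e-30    # kg
C = 3.0e8          # m/s

LAMBDA_C = 2 * math.pi * HBAR / (M_E * C)   # Compton wavelength of the electron, m


def scattered_wavelength(wavelength, theta):
    """Eq. (1.20b): photon wavelength after scattering by angle theta (radians)."""
    return wavelength + LAMBDA_C * (1 - math.cos(theta))


def energy_balance_mismatch(omega, theta):
    """Relative violation of Eq. (1.17) when Eq. (1.20a) is used for the final photon energy."""
    rest = M_E * C ** 2
    e_in = HBAR * omega
    e_out = 1.0 / (1.0 / e_in + (1 - math.cos(theta)) / rest)   # Eq. (1.20a)
    # Electron momentum components times c, from Eqs. (1.18) and (1.19):
    cp_parallel = e_in - e_out * math.cos(theta)
    cp_perp = e_out * math.sin(theta)
    cp_squared = cp_parallel ** 2 + cp_perp ** 2
    lhs = e_in + rest                                   # left-hand side of Eq. (1.17)
    rhs = e_out + math.sqrt(cp_squared + rest ** 2)     # right-hand side of Eq. (1.17)
    return abs(lhs - rhs) / lhs


if __name__ == "__main__":
    print(f"lambda_C = {LAMBDA_C:.3e} m")
    print(energy_balance_mismatch(omega=1e19, theta=1.0))   # rounding-level number
```

The mismatch stays at the level of floating-point rounding for any frequency and angle, as it should if Eq. (1.20a) indeed solves the system.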


(v) De Broglie wave diffraction. In 1927, following the suggestion by W. Elsasser (who was
excited by de Broglie’s conjecture of “matter waves”), C. Davisson and L. Germer, and independently
G. Thomson succeeded in observing diffraction of electrons on crystals (Fig. 5). Specifically, they
found that the intensity of the elastic reflection from a crystal increases sharply when the angle $\theta$ between
the incident beam of electrons and crystal’s atomic planes, separated by distance $d$, satisfies the
following relation:

$$ 2d\sin\theta = n\lambda, \tag{1.21} $$

where $\lambda = 2\pi/k = 2\pi\hbar/p$ is the de Broglie wavelength of electrons, and $n$ is an integer. As Fig. 5 shows,
this is just the well-known condition 17 that the optical path difference $\Delta l = 2d\sin\theta$ between the de
Broglie waves reflected from two adjacent crystal planes coincides with an integer number of $\lambda$, i.e. of
the constructive interference of the waves. 18
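To get a feeling for the numbers involved, the following sketch evaluates the de Broglie wavelength of a nonrelativistic electron and lists the angles allowed by Eq. (1.21). The beam energy of 100 eV and the plane spacing of 0.2 nm are hypothetical values chosen only for illustration; they are not taken from the experiments cited above.

```python
import math

HBAR = 1.055e-34   # J*s
M_E = 0.911e-30    # kg
EV = 1.602e-19     # J


def de_broglie_wavelength(kinetic_energy_ev, mass=M_E):
    """lambda = 2*pi*hbar / p, with the nonrelativistic momentum p = sqrt(2 m E)."""
    p = math.sqrt(2 * mass * kinetic_energy_ev * EV)
    return 2 * math.pi * HBAR / p


def bragg_angles(wavelength, d):
    """All angles theta (radians) satisfying Eq. (1.21), 2 d sin(theta) = n lambda, n = 1, 2, ..."""
    angles = []
    n = 1
    while n * wavelength <= 2 * d:
        angles.append(math.asin(n * wavelength / (2 * d)))
        n += 1
    return angles


wavelength = de_broglie_wavelength(100.0)      # hypothetical 100 eV electron beam
angles = bragg_angles(wavelength, d=0.2e-9)    # hypothetical plane spacing of 0.2 nm
print(f"lambda = {wavelength * 1e9:.4f} nm")
print("Bragg angles (degrees):", [round(math.degrees(a), 1) for a in angles])
```

The wavelength comes out comparable to typical interatomic distances, which is why crystals work as diffraction gratings for electrons of such energies.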



Fig. 1.5. Electron scattering from a crystal 
lattice. 


16 The constant $\lambda_C$, which participates in this relation, is close to $2.43\times10^{-12}$ m and is called the Compton
wavelength of the electron. This term is somewhat misleading: as the reader can see from Eqs. (17)-(19), no wave
in the Compton problem has such a wavelength - either before or after the scattering.

17 Frequently called the Bragg condition, due to the pioneering experiments by W. Bragg with X-ray scattering 
from crystals (that started in 1912). 

18 Later, spectacular experiments with diffraction and interference of heavier particles, e.g., neutrons and even C60
molecules, have also been performed - see, e.g., a review by A. Zeilinger et al., Rev. Mod. Phys. 60, 1067 (1988)
and a later publication by O. Nairz et al., Am. J. Phys. 71, 319 (2003). Nowadays, such interference of heavy
particles is used for ultrasensitive measurements of gravity - see, e.g., a popular review by M. Arndt, Phys. Today
67, 30 (May 2014), and recent advanced experiments by P. Hamilton et al., Phys. Rev. Lett. 114, 100405 (2015).
Moreover, quantum interference between different parts and different quantum states of such macroscopic objects
as superconducting condensates of millions of Cooper pairs has been observed - see Sec. 3.1 below for details.

To summarize, all the listed effects may be explained starting from two very simple (and
similarly looking) formulas: Eq. (5) for photons, and Eq. (15) for both photons and electrons - both
relations involving the same Planck’s constant. This might give an impression of sufficient experimental
evidence to declare light consisting of discrete particles (photons), and, on the contrary, electrons being
some “matter waves” rather than particles. However, by that time (the mid-1920s) physics had
accumulated overwhelming evidence of wave properties of light, such as interference and diffraction. In
addition, there was also strong evidence for lumped-particle (“corpuscular”) behavior of electrons. It is
sufficient to mention the famous oil-drop experiments by R. Millikan and H. Fletcher (1909-1913), in
which only single (and whole!) electrons could be added to an oil drop, changing its total electric charge
by multiples of electron’s charge ($-e$) - and never its fraction. It was apparently impossible to reconcile
these observations with a purely wave picture, in which an electron and hence its charge need to be
spread over the wave, so that an arbitrary part of it could be cut out using appropriate experimental
setups.

Thus the founding fathers of quantum mechanics faced a formidable task of reconciling the wave
and corpuscular properties of electrons and photons - and other particles. The decisive breakthrough in
that task was achieved in 1926 by Erwin Schrödinger and Max Born, who formulated what is now
known as either the Schrödinger picture of nonrelativistic quantum mechanics in the coordinate
representation, or simply as wave mechanics. I will now formulate that picture, somewhat disregarding
the actual history of its development.


1.2. Wave mechanics postulates 

Let us consider a spinless, 19 nonrelativistic point-like particle whose classical dynamics may be
described by a certain Hamiltonian function $H(\mathbf{r}, \mathbf{p}, t)$, 20 where $\mathbf{r}$ is particle’s radius-vector and $\mathbf{p}$ is
its momentum. 21 Wave mechanics of such Hamiltonian particles may be based on the following set of
postulates 22 that are comfortingly elegant - though their final justification is given only by the agreement
of all their corollaries with experiment.

(i) Wavefunction and probability. Such variables as $\mathbf{r}$ or $\mathbf{p}$ cannot always be measured exactly,
even at “perfect conditions” when all external uncertainties, including measurement instrument
imperfection, macroscopic fluctuations of the initial state preparation, and unintended particle
interactions with its environment, have been removed. 23 Moreover, $\mathbf{r}$ and $\mathbf{p}$ of the same particle can


19 Actually, in wave mechanics, the spin of the described particle does not have to be equal to zero. Rather, it is assumed
that the spin effects are negligible - as they are, for example, for a nonrelativistic electron moving in a region
without an appreciable magnetic field.

20 As a reminder, for many systems (including those whose kinetic energy is a quadratic-homogeneous function of
generalized velocities, like $mv^2/2$), $H$ coincides with the total energy $E$ - see, e.g., CM Sec. 2.3.

21 Note that this restriction is very important. In particular, it excludes from our current discussion the particles
whose interaction with environment is irreversible, for example the viscosity leading to particle’s energy
decay. Such systems need a more general quantum-mechanical description that will be discussed in Chapter 7.

22 Generally, quantum mechanics, as any theory, may be built on different sets of postulates (“axioms”) leading to 
the same conclusions. In this text, I will not try to beat down the number of postulates to the absolute minimum, 
not only because this would require longer argumentation, but chiefly because such attempts typically result in 
making certain implicit assumptions hidden from the reader - the practice as common as regrettable. 

23 I will imply such perfect conditions until the discussion of particle’s interaction with environment, and realistic 
(“physical”) measurements in Chapter 7. 


Chapter 1 


Page 7 of 26 





Essential Graduate Physics 


QM: Quantum Mechanics 


never be measured exactly simultaneously. Instead, even the most detailed description of the particle’s
state, allowed by Nature, 24 is given by a certain complex function $\Psi(\mathbf{r}, t)$, called the wavefunction, that
generally enables only probabilistic predictions of measured values of $\mathbf{r}$, $\mathbf{p}$, and other directly
measurable variables (in quantum mechanics, called observables).

Specifically, the probability $dW$ of finding a particle inside an infinitesimal volume $dV \equiv d^3r$ is
proportional to this volume and may be characterized by the probability density $w \equiv dW/d^3r$ that in turn
is related to the wavefunction as

$$ w = |\Psi(\mathbf{r},t)|^2 \equiv \Psi^*(\mathbf{r},t)\,\Psi(\mathbf{r},t), \tag{1.22a} $$


where sign * means the complex conjugate. As a result, the total probability of finding the particle 
somewhere inside a volume V may be calculated as 


$$ W = \int_V w\,d^3r = \int_V \Psi^*\Psi\,d^3r. \tag{1.22b} $$


In particular, if the volume $V$ contains the particle definitely (i.e. with the 100% probability, $W = 1$), Eq.
(22b) is reduced to the so-called normalization condition

$$ \int_V \Psi^*\Psi\,d^3r = 1. \tag{1.22c} $$


(ii) Observables and operators. To each observable $A$, quantum mechanics associates a certain
linear operator $\hat{A}$, such that, in the perfect conditions mentioned above, the average measured value
(also called the expectation value) of $A$ is expressed as 25

$$ \langle A\rangle = \int_V \Psi^*\hat{A}\,\Psi\,d^3r, \tag{1.23} $$


where $\langle...\rangle$ means the statistical average, i.e. the result of averaging the measurement results over a large
ensemble (set) of macroscopically similar experiments, and $\Psi$ is the normalized wavefunction - see Eq.
(22c). For Eqs. (22) and (23) to be compatible, the identity (“unit”) operator $\hat{I}$, defined by relation

$$ \hat{I}\Psi = \Psi, \tag{1.24} $$


has to be associated with a particular type of measurement, namely with particle’s detection. 


(iii) Hamiltonian operator and the Schrödinger equation. Another particular operator, the
Hamiltonian $\hat{H}$, whose observable is the particle’s energy $E$, also plays in wave mechanics a very
special role, because it participates in the Schrödinger equation,

$$ i\hbar\frac{\partial\Psi}{\partial t} = \hat{H}\Psi, \tag{1.25} $$


24 This is one more important caveat. As we will see in Chapter 7, in many cases even the Hamiltonian particles 
cannot be described by a certain wavefunction, and allow only a more general (and less precise) description, e.g., 
by the density matrix. 

25 This key measurement postulate is sometimes called the Born rule.

that determines wavefunction’s dynamics, i.e. its time evolution.

(iv) Radius-vector and momentum operators. In the coordinate representation accepted in wave
mechanics, the (vector) operator of particle’s radius-vector $\hat{\mathbf{r}}$ just multiplies the wavefunction by this
vector, while the operator of particle’s momentum 26 is represented by the spatial derivative:

$$ \hat{\mathbf{p}} = -i\hbar\nabla, \tag{1.26a} $$

where $\nabla$ is the del (or “nabla”) vector operator. 27 Thus in the Cartesian coordinates,

$$ \hat{\mathbf{r}} = \mathbf{r} = \{x, y, z\}, \qquad \hat{\mathbf{p}} = -i\hbar\left\{\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right\}. \tag{1.26b} $$


(v) Correspondence principle. In the limit when quantum effects are insignificant, e.g., when the
characteristic scale of action $S$ 28 (i.e. the product of the relevant energy and time scales of the problem)
is much larger than Planck’s constant $\hbar$, all wave mechanics results have to tend to those given by
classical mechanics. Mathematically, the correspondence is achieved by duplicating the classical
relations between observables by similar relations between the corresponding operators. For example,
for a free particle, the Hamiltonian (that in this case corresponds to the kinetic energy alone) has the
form

$$ \hat{H} = \frac{\hat{\mathbf{p}}^2}{2m}, \tag{1.27a} $$

so that, taking into account Eq. (26b), in the Cartesian coordinates,

$$ \hat{H} = \frac{1}{2m}\left(\hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2\right) = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) \equiv -\frac{\hbar^2}{2m}\nabla^2. \tag{1.27b} $$

Even before a discussion of physics of the postulates (offered in the next section), we may
immediately see that they indeed provide a way toward the resolution of the apparent contradiction
between the wave and corpuscular properties of particles. For a free particle, the Schrödinger equation
(25), with the substitution of Eq. (27), takes the form

$$ i\hbar\frac{\partial\Psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\Psi, \tag{1.28} $$

whose particular (but most important) solution is a plane, monochromatic wave, 29

$$ \Psi(\mathbf{r}, t) = a\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{1.29} $$


26 For an electrically charged particle in magnetic field, this relation is valid for its canonical momentum - see 
Sec. 3.1 below. 

27 See, e.g., Secs. 8-10 of the Selected Mathematical Formulas appendix (below, referred to as MA). Note that 
according to these formulas, the del operator follows all the geometric rules of the usual (c-number) vectors. This 
is, by definition, true for other vector operators of quantum mechanics to be discussed below. 

28 See, e.g., CM Sec. 10.3. 

29 See, e.g., CM Sec. 7.7 and/or EM Sec. 7.1.

where $a$, $\mathbf{k}$ and $\omega$ are constants. Indeed, plugging Eq. (29) into Eq. (28), we immediately see that the plane
wave, with an arbitrary amplitude $a$, is indeed a solution of the Schrödinger equation, provided a
specific dispersion relation between wavevector $\mathbf{k}$ and frequency $\omega$:

$$ \hbar\omega = \frac{(\hbar k)^2}{2m}. \tag{1.30} $$
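The substitution of the plane wave into the free-particle Schrödinger equation may also be checked numerically. The following sketch is a simplified illustration rather than a proof: it assumes one spatial dimension, units with ħ = m = 1, and central finite differences for the derivatives. The residual of Eq. (1.28) is small only when ω obeys the dispersion relation (1.30).

```python
import cmath

HBAR = 1.0   # assumed units with hbar = 1
MASS = 1.0   # and particle mass m = 1


def plane_wave(x, t, k, omega, a=1.0):
    """1D version of Eq. (1.29): a * exp[i (k x - omega t)]."""
    return a * cmath.exp(1j * (k * x - omega * t))


def schrodinger_residual(k, omega, x=0.3, t=0.7, h=1e-4):
    """|i hbar dPsi/dt + (hbar^2 / 2m) d2Psi/dx2| at one point, by central differences."""
    def psi(xx, tt):
        return plane_wave(xx, tt, k, omega)

    dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h ** 2
    lhs = 1j * HBAR * dpsi_dt                      # left-hand side of Eq. (1.28)
    rhs = -(HBAR ** 2) / (2 * MASS) * d2psi_dx2    # right-hand side of Eq. (1.28)
    return abs(lhs - rhs)


k = 2.0
omega_free = HBAR * k ** 2 / (2 * MASS)            # Eq. (1.30)
print(schrodinger_residual(k, omega_free))         # small: only discretization error remains
print(schrodinger_residual(k, 2 * omega_free))     # of the order of hbar*omega: not a solution
```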


Constant $a$ may be calculated, for example, assuming that solution (29) is extended over a certain
volume $V$, while beyond it, $\Psi = 0$. Then from the normalization condition (22c) and Eq. (29), we get 30

$$ |a|^2V = 1. \tag{1.31} $$


Now we can use Eqs. (23), (26) and (27) to calculate the expectation values of particle’s
momentum $\mathbf{p}$ and energy $E$ (which, for a free particle, coincides with its Hamiltonian function $H$). The
result is

$$ \langle\mathbf{p}\rangle = \hbar\mathbf{k}, \qquad \langle E\rangle = \langle H\rangle = \frac{(\hbar k)^2}{2m}; \tag{1.32} $$

according to Eq. (30), the last equality may be rewritten as $\langle E\rangle = \hbar\omega$.


Next, Eq. (23) enables one to calculate not only the statistical average (in the math speak, the
first moment) of an observable, but also its higher moments, notably the second moment (in physics,
usually called either the variance or dispersion):

$$ \langle\tilde{A}^2\rangle \equiv \langle(A - \langle A\rangle)^2\rangle = \langle A^2\rangle - \langle A\rangle^2, \tag{1.33} $$

and hence its root mean square (r.m.s.) fluctuation,

$$ \delta A \equiv \langle\tilde{A}^2\rangle^{1/2}, \tag{1.34} $$

that characterizes the scale of deviations $\tilde{A} \equiv A - \langle A\rangle$ of measurement results from the average, i.e. the
uncertainty of observable $A$. In application to wavefunction (29), these relations yield $\delta E = 0$, $\delta\mathbf{p} = 0$,
while the particle coordinate $\mathbf{r}$ (at $V \to \infty$) is completely uncertain. This means that in the plane-wave,
monochromatic state (29), the energy and momentum of the particle are exactly defined, so that the
signs of statistical average in Eqs. (32) might be removed. Thus, these relations are reduced to the
experimentally-inferred Eqs. (5) and (15), though the relation of frequency $\omega$ of wavefunction’s
evolution in time to experimental observations still has to be clarified.


Hence the wave mechanics postulates may indeed explain the observed wave properties of
nonrelativistic particles. (For photons, we would need a relativistic formalism - see Ch. 9 below.) On
the other hand, due to the linearity of the Schrödinger equation (25), any sum of its solutions is also a
solution - the so-called linear superposition principle. For a free particle, this means that any sum of plane
waves (29) is also a solution of this equation. Such sums, with close values of $\mathbf{k}$ and hence $\mathbf{p} = \hbar\mathbf{k}$ (and,
according to Eq. (30), of $\omega$ as well), may be used to describe spatially localized “pulses”, called wave
packets - see Fig. 6. In Sec. 2.1, I will prove (or rather reproduce H. Weyl’s proof :-) that the wave


30 For infinite space ($V \to \infty$), Eq. (31) yields $a \to 0$, i.e. wavefunction (29) vanishes. This formal problem may be
readily resolved considering sufficiently long wave packets - see Sec. 2.2 below.


packet extension $\delta x$ in any direction (say, $x$) is related to the width $\delta k_x$ of the corresponding component
of its wave vector distribution as $\delta x\,\delta k_x \geq 1/2$, and hence, according to Eq. (15), to the width $\delta p_x$ of the
momentum component distribution as

$$ \delta x\,\delta p_x \geq \frac{\hbar}{2}. \tag{1.35} $$

Fig. 1.6. (a) Snapshot of a typical wave packet
propagating along axis $x$, and (b) the corresponding
distribution of wave numbers $k_x$, i.e. momenta $p_x$.


This is the famous Heisenberg’s uncertainty principle, which quantifies the first
postulate’s point that coordinate and momentum cannot be defined exactly simultaneously. However,
since Planck’s constant is extremely small on the human scale of things, it still allows for the
particle’s localization in a very small volume even if the momentum spread in the wave packet is also
small on that scale. For example, according to Eq. (35), a 0.1% spread of momentum of a 1 keV electron
($p \approx 1.7\times10^{-23}$ kg·m/s) allows a wave packet to be as small as ~$3\times10^{-9}$ m. (For a heavier particle such as
a proton, the packet would be even tighter.) As a result, wave packets may be used to describe particles
that are point-like from the macroscopic point of view.
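The arithmetic of this estimate is short enough to spell out. The sketch below takes the nonrelativistic momentum p = (2mE)^(1/2) and treats Eq. (1.35) as an equality, δx = ħ/2δp, with δp equal to the given fraction of p. The proton mass is a standard value added by me for the comparison; the function names are illustrative. With these inputs, the 1 keV electron has p of about 1.7×10⁻²³ kg·m/s, and a 0.1% momentum spread allows a packet of about 3 nm.

```python
import math

HBAR = 1.055e-34   # J*s
EV = 1.602e-19     # J
M_E = 0.911e-30    # kg
M_P = 1.673e-27    # kg, proton mass (standard value, not quoted in the text)


def momentum(kinetic_energy_ev, mass):
    """Nonrelativistic momentum p = sqrt(2 m E), in kg*m/s."""
    return math.sqrt(2 * mass * kinetic_energy_ev * EV)


def min_packet_size(p, relative_spread):
    """Smallest delta-x allowed by Eq. (1.35) for delta-p = relative_spread * p."""
    return HBAR / (2 * relative_spread * p)


if __name__ == "__main__":
    for name, mass in (("electron", M_E), ("proton", M_P)):
        p = momentum(1000.0, mass)
        print(f"{name}: p = {p:.2e} kg*m/s, delta-x >= {min_packet_size(p, 1e-3):.1e} m")
```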

In a nutshell, this is the main idea of the wave mechanics, and the first part of this course 
(Chapters 1-3) will be essentially a discussion of various manifestations of this approach. During this 
discussion, we will not only evidence wave mechanics’ many triumphs within its applicability domain, 
but will also gradually accumulate evidence for its handicaps, which force the eventual transfer to a 
more general formalism - to be discussed in Chapter 4 and beyond. 
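The relation between the widths of a wave packet and of its wave number distribution can be probed numerically before it is proved. The sketch below assumes a Gaussian envelope (my choice of packet shape; the text above does not specify one), computes its Fourier amplitudes by direct summation on a uniform grid, and evaluates both r.m.s. widths. For this particular shape the product δx·δk_x comes out at the lower bound 1/2.

```python
import cmath
import math


def rms_width(points, weights):
    """Root-mean-square deviation of `points` under the (unnormalized) distribution `weights`."""
    total = sum(weights)
    mean = sum(p * w for p, w in zip(points, weights)) / total
    variance = sum((p - mean) ** 2 * w for p, w in zip(points, weights)) / total
    return math.sqrt(variance)


def gaussian_packet_widths(sigma=1.0, k0=5.0, n=201, span=8.0):
    """Return (delta_x, delta_k) for psi(x) ~ exp(-x^2 / (4 sigma^2) + i k0 x)."""
    xs = [sigma * span * (2 * i / (n - 1) - 1) for i in range(n)]
    psi = [cmath.exp(-x * x / (4 * sigma ** 2) + 1j * k0 * x) for x in xs]
    # Wave numbers around k0; the grid covers several expected widths 1/(2 sigma).
    ks = [k0 + span * (2 * j / (n - 1) - 1) / (2 * sigma) for j in range(n)]
    # Fourier amplitudes by direct summation (the constant grid step drops out of the widths).
    phi = [sum(p * cmath.exp(-1j * k * x) for p, x in zip(psi, xs)) for k in ks]
    delta_x = rms_width(xs, [abs(p) ** 2 for p in psi])
    delta_k = rms_width(ks, [abs(f) ** 2 for f in phi])
    return delta_x, delta_k


if __name__ == "__main__":
    dx, dk = gaussian_packet_widths()
    print(dx, dk, dx * dk)
```

Changing `sigma` rescales the two widths in opposite directions while leaving their product unchanged.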


1.3. Postulates’ discussion 

The postulates listed in the previous section look very simple, and they are hopefully familiar to 
the reader from his or her undergraduate studies. However, the physics of these axioms is very deep,
they lead to several counter-intuitive conclusions, and their in-depth discussion requires solutions of 
several key problems using these axioms. This is why in this section I will give only an initial, 
admittedly superficial discussion of the postulates, and will be repeatedly returning to the conceptual 
foundations of quantum mechanics throughout the course, especially in Secs. 7.7, 10.1, and 10.2. 

First of all, the fundamental uncertainty of observables, which is at the core of postulate (i), is
very foreign to the basic ideas of classical mechanics, and historically has made quantum mechanics so
hard to swallow for many star physicists, notably including A. Einstein - despite his 1905 work which
essentially launched the whole field! However, this fact has been confirmed by numerous experiments,
and (more importantly) there has not been a single confirmed experiment that would contradict
this postulate, so that quantum mechanics was long ago promoted from a theoretical hypothesis to the
rank of a reliable scientific theory.

One more remark in this context is that Eq. (25) itself is deterministic, i.e. conceptually enables
an exact calculation of wavefunction’s distribution in space at any instant $t$, provided that its initial
distribution, and particle’s Hamiltonian, are known exactly. In classical kinetics, the probability density
distribution $w(\mathbf{r}, t)$ may be also calculated from deterministic differential equations, e.g., the Fokker-Planck
equation or the Boltzmann equation. 31 The quantum-mechanical description differs from those
situations in two important aspects. First, in the perfect conditions outlined above (exact initial state
preparation, no irreversible interaction with environment, the best possible measurement), the Fokker-Planck
equation reduces to the 2nd Newton law, i.e. the statistical uncertainty disappears. In quantum
mechanics this is not true: the quantum uncertainty, such as Eq. (35), persists even in this limit. Second,
the wavefunction $\Psi(\mathbf{r}, t)$ gives more information than just $w(\mathbf{r}, t)$, because besides the modulus of $\Psi$,
involved in Eq. (22), this complex function also has the phase $\varphi \equiv \arg\Psi$, which may affect some observables,
describing, in particular, the interference and diffraction of the de Broglie waves.


Next, it is very important to understand that the relation of quantum mechanics to
experiment, given by postulate (ii), necessarily involves another key notion: that of the corresponding
statistical ensemble. Such an ensemble may be defined as a set of many experiments carried out at
apparently (macroscopically) similar conditions, which nevertheless may lead to different measurement
results (outcomes). Indeed, the probability of a certain ($n$-th) outcome of an experiment may be only
defined for a certain ensemble, as the limit

$$ W_n \equiv \lim_{M\to\infty}\frac{M_n}{M}, \quad \text{with } M \equiv \sum_{n=1}^{N} M_n, \tag{1.36} $$


where $M$ is the total number of experiments, $M_n$ is the number of outcomes of the $n$-th type, and $N$ is the
number of different outcomes. It is clear that a particular choice of an ensemble may affect probabilities
$W_n$ very significantly.


For example, if we pull out playing cards at random from a pack of 52 different cards of 4 suits, 
the probability $W_n$ of getting a certain card (e.g., the queen of spades) is 1/52. However, if cards of a 
certain suit (say, hearts) had been taken out from the pack in advance, the probability of getting the 
queen of spades is higher, 1/39. It is important that we would also get the latter value of the probability 
even if we had used the full 52-card pack, but for some reason ignored the results of all experiments giving 
us any card of hearts. 
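
The card example may be checked with a few lines of code. The sketch below is only an illustration: the deck representation, the function names, the number of draws, and the random seed are all my assumptions. It computes the two exact probabilities, and then estimates the second one as the frequency (1.36) in an ensemble where the full pack is used but all outcomes giving hearts are ignored.

```python
import random
from fractions import Fraction

SUITS = ["spades", "hearts", "diamonds", "clubs"]
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
DECK = [(rank, suit) for suit in SUITS for rank in RANKS]  # 52 different cards
TARGET = ("Q", "spades")


def exact_probability(deck, target):
    """Probability of drawing `target` from a pack of equally likely cards."""
    return Fraction(sum(card == target for card in deck), len(deck))


def postselected_frequency(n_draws, seed=0):
    """Draw from the full pack, but count only the non-heart outcomes."""
    rng = random.Random(seed)
    counted = hits = 0
    for _ in range(n_draws):
        card = rng.choice(DECK)
        if card[1] == "hearts":
            continue  # this outcome is ignored, i.e. excluded from the ensemble
        counted += 1
        hits += card == TARGET
    return hits / counted


full = exact_probability(DECK, TARGET)
no_hearts = exact_probability([c for c in DECK if c[1] != "hearts"], TARGET)
estimate = postselected_frequency(200_000)
print(full, no_hearts, estimate)
```

The estimated frequency should approach 1/39 rather than 1/52, though the same full pack was used in every draw: only the subset of counted outcomes has changed.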


Similarly, in quantum mechanics, the probability distributions (and hence expectation values of 
particle coordinate and other observables) depend not only on the experiment setup, but also on the set 
of outcomes we count. Because of the fundamental relation (22) between $w$ and $\Psi$, this means the 
wavefunction also depends on those factors, i.e. on both the experiment setup preparation and the subset of 
outcomes taken into account. The insistence on the attribution of the wavefunction to a single 
experiment, both before and after the measurement, may lead to very unphysical interpretations of some 
experiments, including the wavefunction's evolution not described by the Schrödinger equation (the 
so-called wave packet reduction), superluminal action at a distance, etc. Later in the course we will see that 
minding the statistical nature of quantum mechanics, and in particular the dependence of the 
wavefunction on the statistical ensemble's specification, may readily explain some apparent paradoxes of 
quantum measurements. 

31 See, e.g., SM Secs. 5.8 and 6.2, respectively. 

Let me also emphasize that statistics is intimately related to information theory - and not only 
via their common mathematical background, the probability theory. For example, the question, “What 
subset of experimental results will we count?” may be replaced by the question, “What subset of results 
will we use information about?” As a result, the reader has to be prepared for the use of information 
theory notions in the discussion of quantum mechanics, or at least of its relation to experiment - i.e. to the 
“physical reality”. This feature of quantum mechanics makes some physicists uncomfortable, because 
much of classical mechanics and electrodynamics may be discussed without any reference to 
information. In quantum mechanics (as in statistical mechanics), such an abstraction is impossible. 


Proceeding to postulate (ii) and in particular Eq. (23), a better feeling of this definition may be 
obtained by its comparison with the general definition of the expectation value (i.e. the statistical 
average) in the probability theory. Namely, let each of $N$ possible outcomes in a set of $M$ 
macroscopically similar experiments give a certain value $A_n$ of observable $A$; then 

Definition of statistical average:
$$\langle A \rangle = \lim_{M \to \infty} \frac{1}{M} \sum_{n=1}^{N} A_n M_n = \sum_{n=1}^{N} A_n W_n. \qquad (1.37)$$


Taking into account Eq. (22), which relates $w$ and $\Psi$, the structures of Eq. (23) and of the final form of 
Eq. (37) are similar. Their exact relation will be further discussed in Sec. 4.1. 


1.4. Continuity equation 

The wave mechanics postulates survive one more sanity check: they satisfy the natural 
requirement that the particle does not appear or vanish in the course of the quantum evolution. 32 Indeed, 
let us use Eq. (22) to calculate the rate of change of the probability W to find the particle within a certain 
volume V: 


$$\frac{dW}{dt} = \frac{d}{dt} \int_V \Psi \Psi^* \, d^3r. \qquad (1.38)$$


Assuming for simplicity that the boundaries of volume $V$ do not move, it is sufficient to carry out the 
partial differentiation of the product $\Psi \Psi^*$ inside the integral. Using the time-dependent Schrödinger 
equation (25), together with its complex conjugate, 

$$-i\hbar \frac{\partial \Psi^*}{\partial t} = \left( \hat{H} \Psi \right)^*, \qquad (1.39)$$


we get 


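$$\frac{dW}{dt} = \int_V \left( \Psi^* \frac{\partial \Psi}{\partial t} + \Psi \frac{\partial \Psi^*}{\partial t} \right) d^3r = \frac{1}{i\hbar} \int_V \left[ \Psi^* \left( \hat{H} \Psi \right) - \Psi \left( \hat{H} \Psi \right)^* \right] d^3r. \qquad (1.40)$$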


32 Note that this requirement is not extended to the relativistic quantum theory - see Chapter 9 below. 

Let the particle move in a field of external forces (not necessarily constant in time), so that its 
classical Hamiltonian function $H$ is a sum of the particle's kinetic energy $p^2/2m$ and its potential energy 
$U(\mathbf{r}, t)$. 33 According to the correspondence principle, the Hamiltonian operator may be presented as the sum 34 

$$\hat{H} = \frac{\hat{p}^2}{2m} + U(\mathbf{r}, t) = -\frac{\hbar^2}{2m} \nabla^2 + U(\mathbf{r}, t). \qquad (1.41)$$


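At this stage we should notice that such an operator, when acting on a real function, returns a real 
function. 35 Hence, the result of its action on an arbitrary complex function $\Psi = a + ib$ (where $a$ and $b$ are 
real) is 

$$\hat{H} \Psi = \hat{H} (a + ib) = \hat{H} a + i \hat{H} b, \qquad (1.42)$$

where $\hat{H} a$ and $\hat{H} b$ are also real, while 

$$\left( \hat{H} \Psi \right)^* = \left( \hat{H} a + i \hat{H} b \right)^* = \hat{H} a - i \hat{H} b = \hat{H} (a - ib) = \hat{H} \Psi^*. \qquad (1.43)$$

This means that Eq. (40) may be rewritten as 

$$\frac{dW}{dt} = \frac{1}{i\hbar} \int_V \left( \Psi^* \hat{H} \Psi - \Psi \hat{H} \Psi^* \right) d^3r = -\frac{\hbar}{2im} \int_V \left( \Psi^* \nabla^2 \Psi - \Psi \nabla^2 \Psi^* \right) d^3r. \qquad (1.44)$$

(The terms with the potential energy $U$ cancel in this difference.) Now, let us use general rules of vector 
calculus 36 to write the following identity: 

$$\nabla \cdot \left( \Psi^* \nabla \Psi - \Psi \nabla \Psi^* \right) = \Psi^* \nabla^2 \Psi - \Psi \nabla^2 \Psi^*. \qquad (1.45)$$

A comparison of Eqs. (44) and (45) shows that we may write 

$$\frac{dW}{dt} = -\int_V \left( \nabla \cdot \mathbf{j} \right) d^3r, \qquad (1.46)$$

where vector $\mathbf{j}$ is defined as 

$$\mathbf{j} = \frac{i\hbar}{2m} \left( \Psi \nabla \Psi^* - \text{c.c.} \right) = \frac{\hbar}{m} \, \text{Im} \left( \Psi^* \nabla \Psi \right). \qquad (1.47)$$

Here c.c. means the complex conjugate of the previous expression - in this case, $(\Psi \nabla \Psi^*)^*$, i.e. 
$\Psi^* \nabla \Psi$. Now using the well-known divergence theorem, 37 Eq. (46) may be rewritten as the continuity 
equation 

$$\frac{dW}{dt} + I = 0, \quad \text{with } I = \oint_S j_n \, d^2r, \qquad (1.48)$$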



33 As a reminder, such description is valid not only for potential forces (in that case U has to be time- 
independent), but also for any force F(r, t) which may be presented via the gradient of U(r, t) - see, e.g., CM 
Chapters 2 and 10. (A good example when such a description is impossible is given by the magnetic component 
of the Lorentz force - see, e.g., EM Sec. 9.7, and also Sec. 3.1 of this course.) 

34 Historically, this was the main step made (in 1926) by E. Schrödinger on the background of L. de Broglie’s 
idea. The probabilistic interpretation of the wavefunction was put forward, almost simultaneously, by M. Born. 

35 In Chapter 4, we will discuss a more general family of Hermitian operators, which have this property. 

36 See, e.g., MA Eq. (11.4a), combined with the del operator’s definition $\nabla^2 = \nabla \cdot \nabla$. 

37 See, e.g., MA Eq. (12.2). 

where $j_n$ is the projection of vector $\mathbf{j}$ on the outwardly directed normal to surface $S$ that limits 
volume $V$, i.e. the scalar product $\mathbf{j} \cdot \mathbf{n}$, where $\mathbf{n}$ is the unit vector along this normal. 

Equations (47) and (48) show that if the wavefunction on the surface vanishes, the total 
probability $W$ of finding the particle within the volume does not change, providing the required sanity 
check. In the general case, Eq. (48) says that $dW/dt$ equals the flux $I$ of vector $\mathbf{j}$ through the surface, 
with the minus sign. It is clear that this vector may be interpreted as the probability current density - and 
$I$, as the total probability current through surface $S$. This interpretation may be further supported by 
rewriting Eq. (47) for a wavefunction presented in the polar form $\Psi = a e^{i\varphi}$, with real $a$ and $\varphi$: 

$$\mathbf{j} = a^2 \frac{\hbar}{m} \nabla \varphi, \qquad (1.49)$$

- evidently a real quantity. Note that for a real wavefunction, or even for that with an arbitrary but 
space-constant phase $\varphi$, the probability current density vanishes. On the contrary, for the traveling 
wave (29), with a constant probability density $w = a^2$, Eq. (49) yields a nonvanishing (and physically very 
transparent) result: 

$$\mathbf{j} = w \frac{\hbar}{m} \mathbf{k} = w \frac{\mathbf{p}}{m} = w \mathbf{v}, \qquad (1.50)$$

where $\mathbf{v} = \mathbf{p}/m$ is the particle’s velocity. If multiplied by the particle’s mass $m$, the probability 
density $w$ turns into the (average) mass density $\rho$, and the probability current density into the mass flux 
density $\rho \mathbf{v}$, while if multiplied by the total electric charge $q$ of the particle, with $w$ turning into the 
charge density $\sigma$, $\mathbf{j}$ becomes the electric current density, both satisfying the classical continuity 
equations similar to Eq. (48). 38 
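
As a quick numerical illustration of Eqs. (47) and (50), the following sketch evaluates the 1D version of the probability current density, $j = (\hbar/m)\,\text{Im}(\Psi^* \partial\Psi/\partial x)$, for a traveling wave and for a standing wave. The units $\hbar = m = 1$, the finite-difference step, the sample point, and all the names are my assumptions, made only to keep the example short.

```python
import cmath

HBAR = 1.0  # assumed units: hbar = m = 1
MASS = 1.0


def current_density(psi, x, h=1e-5):
    """1D form of Eq. (47): j = (hbar/m) Im(psi* dpsi/dx), central difference."""
    dpsi_dx = (psi(x + h) - psi(x - h)) / (2 * h)
    return (HBAR / MASS) * (psi(x).conjugate() * dpsi_dx).imag


k, a = 2.0, 0.5  # wave number and (real) amplitude of the sample waves


def traveling(x):
    return a * cmath.exp(1j * k * x)


def standing(x):
    # Sum of two traveling waves with opposite momenta: 2a*cos(kx)
    return a * (cmath.exp(1j * k * x) + cmath.exp(-1j * k * x))


j_traveling = current_density(traveling, 0.3)
j_standing = current_density(standing, 0.3)
expected = a**2 * HBAR * k / MASS  # Eq. (50): j = w*(hbar/m)*k, with w = a^2
print(j_traveling, expected, j_standing)
```

The traveling wave should reproduce $w\hbar k/m$ within the finite-difference error, while the real (standing) wave should give a vanishing current.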


Finally, let us recast the continuity equation, rewriting Eq. (46) as 

$$\int_V \left( \frac{\partial w}{\partial t} + \nabla \cdot \mathbf{j} \right) d^3r = 0. \qquad (1.51)$$

Now we may argue that this equality may be true for any choice of volume $V$ only if the expression 
under the integral vanishes everywhere, i.e. if 

Continuity equation: differential form:
$$\frac{\partial w}{\partial t} + \nabla \cdot \mathbf{j} = 0. \qquad (1.52)$$

This differential form of the continuity equation is sometimes more convenient than its integral form 
(48). 
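
The differential form of the continuity equation is easy to test numerically. The sketch below takes, as a sample wavefunction, a sum of two free-particle plane waves of the type (29), and checks by finite differences that $\partial w/\partial t + \partial j/\partial x$ vanishes at a sample point. The units $\hbar = m = 1$, the two wave numbers, the step $h$, and the sample point are my assumptions, not part of the text.

```python
import cmath

# Assumed units: hbar = m = 1, so each plane wave has omega = k^2 / 2.
K1, K2 = 1.0, 3.0


def psi(x, t):
    """Sum of two free-particle plane waves (not normalized)."""
    return sum(cmath.exp(1j * (k * x - 0.5 * k * k * t)) for k in (K1, K2))


def density(x, t):
    """Probability density w = |psi|^2."""
    return abs(psi(x, t)) ** 2


def current(x, t, h=1e-4):
    """1D probability current density j = Im(psi* dpsi/dx)."""
    dpsi_dx = (psi(x + h, t) - psi(x - h, t)) / (2 * h)
    return (psi(x, t).conjugate() * dpsi_dx).imag


def density_rate(x, t, h=1e-4):
    """Central-difference estimate of dw/dt."""
    return (density(x, t + h) - density(x, t - h)) / (2 * h)


def current_divergence(x, t, h=1e-4):
    """Central-difference estimate of dj/dx."""
    return (current(x + h, t) - current(x - h, t)) / (2 * h)


x0, t0 = 0.3, 0.1
residual = density_rate(x0, t0) + current_divergence(x0, t0)
print(density_rate(x0, t0), current_divergence(x0, t0), residual)
```

Each of the two printed terms is of the order of unity, while their sum should vanish within the finite-difference error.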


1.5. Eigenstates and eigenvalues 

Now let us discuss important corollaries of wave mechanics’ linearity. First of all, it uses only 
linear operators. This term means that the operators must obey the following two rules: 39 


38 See, e.g., respectively, CM Sec. 7.2 and EM Sec. 4.1. 

39 By the way, if any equality involving operators is valid for an arbitrary wavefunction, the latter is frequently 
dropped from notation, resulting in an operator equality. In particular, Eq. (53) may be readily used to prove that 
the addition of operators is commutative: $\hat{A}_2 + \hat{A}_1 = \hat{A}_1 + \hat{A}_2$, and associative: 
$(\hat{A}_1 + \hat{A}_2) + \hat{A}_3 = \hat{A}_1 + (\hat{A}_2 + \hat{A}_3)$. 

$$\left( \hat{A}_1 + \hat{A}_2 \right) \Psi = \hat{A}_1 \Psi + \hat{A}_2 \Psi, \qquad (1.53)$$

$$\hat{A} \left( c_1 \Psi_1 + c_2 \Psi_2 \right) = \hat{A} \left( c_1 \Psi_1 \right) + \hat{A} \left( c_2 \Psi_2 \right) = c_1 \hat{A} \Psi_1 + c_2 \hat{A} \Psi_2, \qquad (1.54)$$

where $\Psi_n$ are arbitrary wavefunctions, while $c_n$ are arbitrary constants (in quantum mechanics, 
frequently called c-numbers, to distinguish them from operators and wavefunctions). Most important 
examples of linear operators are given by: 

(i) the multiplication by a function, such as for operator r in wave mechanics, and 

(ii) the spatial or temporal differentiation of the wavefunction, such as in Eqs. (25)-(27). 

Next, it is of key importance that the Schrödinger equation (25) is also linear. (We have already 
used this fact when we discussed wave packets in the last section.) This means that if each of functions 
$\Psi_n$ is a (particular) solution of Eq. (25) with a certain Hamiltonian, then an arbitrary linear combination 

$$\Psi = \sum_n c_n \Psi_n \qquad (1.55)$$

is also a solution of the same equation. 40 


Now let us use the linearity of wave mechanics to accomplish an apparently impossible feat: 
immediately find the general solution to the Schrödinger equation for the most important case when 
the system’s Hamiltonian does not depend on time explicitly - for example, like in Eq. (27), or in Eq. (41) 
with time-independent $U = U(\mathbf{r})$. First of all, let us prove that the following product, 

Variable separation:
$$\Psi_n = a_n(t) \psi_n(\mathbf{r}), \qquad (1.56)$$

qualifies as a (particular) solution to the Schrödinger equation. Indeed, plugging Eq. (56) into Eq. (25), 
using the fact that for a time-independent Hamiltonian 

$$\hat{H} a_n(t) \psi_n(\mathbf{r}) = a_n(t) \hat{H} \psi_n(\mathbf{r}), \qquad (1.57)$$

and dividing both parts of the equation by $\Psi_n = a_n \psi_n$, we get 

$$i\hbar \frac{\dot{a}_n}{a_n} = \frac{\hat{H} \psi_n}{\psi_n}, \qquad (1.58)$$

where (here and below) the dot denotes the differentiation over time. The left-hand side of this equation 
may depend only on time, while the right-hand one, only on coordinates. These facts may be only 
reconciled if we assume that each of these parts is equal to (the same) constant of the dimension of 
energy, which I will denote as $E_n$. 41 As a result, we are getting two separate equations for the temporal 
and spatial parts of the wavefunction: 

$$i\hbar \dot{a}_n = E_n a_n, \qquad (1.59)$$


40 It may seem strange that the linear Schrodinger equation correctly describes quantum properties of systems 
whose classical dynamics is described by nonlinear equations of motion (e.g., an anharmonic oscillator - see, e.g., 
CM Sec. 4.2). Note, however, that equations of classical physical kinetics (see, e.g., SM Chapter 6) also have this 
property, so it is not specific to quantum mechanics. 

41 This argumentation, leading to variable separation, is very common in mathematical physics - see, e.g., its 
discussion in EM Sec. 2.5. 

$$\hat{H} \psi_n = E_n \psi_n. \qquad (1.60)$$

The first of these equations is readily integrable, giving 

$$a_n = \text{const} \times \exp\{-i\omega_n t\}, \quad \text{with } \omega_n = \frac{E_n}{\hbar}, \qquad (1.61)$$

and thus substantiating the fundamental relation (5) between energy and frequency. Plugging Eqs. (56) 
and (61) into Eq. (22), we see that in such a state, the probability density $w$ of finding the particle at a 
certain location does not depend on time. Doing the same with Eq. (23) shows that the same is true for the 
expectation value of any operator that does not depend on time explicitly: 

$$\langle A \rangle = \int \psi_n^*(\mathbf{r}) \hat{A} \psi_n(\mathbf{r}) \, d^3r = \text{const}. \qquad (1.62)$$

Due to this property, the states described by Eqs. (56), (60), and (61), are called stationary. In contrast 
to the simple and universal time dependence (61), the spatial distributions $\psi_n(\mathbf{r})$ of the stationary states 
are often hard to find, and the solution of the stationary (or “time-independent”) Schrödinger equation 
(60), 42 which describes the distributions, for various situations is a major focus of wave mechanics. 


The stationary Schrödinger equation (60), with time-independent Hamiltonian (41), 

$$-\frac{\hbar^2}{2m} \nabla^2 \psi_n + U(\mathbf{r}) \psi_n = E_n \psi_n, \qquad (1.63)$$

falls into the mathematical category of linear eigenproblems, 43 in which eigenfunctions $\psi_n$ and 
eigenvalues $E_n$ should be found simultaneously - self-consistently. 44 Mathematics tells us that for 
such problems with space-confined eigenfunctions $\psi_n$, tending to zero at $r \to \infty$, the spectrum of 
eigenvalues is discrete. It also proves that the eigenfunctions corresponding to different eigenvalues are 
orthogonal, i.e. that space integrals of the products $\psi_n \psi^*_{n'}$ vanish for all pairs with $n \neq n'$. Moreover, 
due to the Schrödinger equation linearity, each of these functions may be multiplied by a constant 
coefficient to make this set orthonormal: 

$$\int \psi_n \psi^*_{n'} \, d^3r = \begin{cases} 1, & \text{if } n = n', \\ 0, & \text{if } n \neq n'. \end{cases} \qquad (1.64)$$


Also, the eigenfunctions form a full set, meaning that an arbitrary function $\psi(\mathbf{r})$, in particular the actual 
wavefunction $\Psi$ of the system in the initial moment of its evolution (which I will take for $t = 0$, with a 
few exceptions), may be presented as a unique expansion over the eigenfunction set: 

$$\Psi(\mathbf{r}, 0) = \sum_n c_n \psi_n(\mathbf{r}). \qquad (1.65)$$

The expansion coefficients $c_n$ may be readily found by multiplying both parts of Eq. (65) by $\psi^*_{n'}$, 
integrating the result over the space, and using Eq. (64). The result is 


42 In contrast, the initial Eq. (24) is frequently called the time-dependent or nonstationary Schrodinger equation. 

43 From German root eigen meaning “particular” or “characteristic”. 

44 Eigenvalues of energy are frequently called eigenenergies, and it is often said that eigenfunction $\psi_n$ and 
eigenenergy $E_n$ together characterize the $n$-th stationary eigenstate of the system. 

$$c_n = \int \psi_n^*(\mathbf{r}) \Psi(\mathbf{r}, 0) \, d^3r. \qquad (1.66)$$

Now let us consider the following wavefunction 

$$\Psi(\mathbf{r}, t) = \sum_n c_n a_n(t) \psi_n(\mathbf{r}) = \sum_n c_n \psi_n(\mathbf{r}) \exp\left\{ -i \frac{E_n}{\hbar} t \right\}. \qquad (1.67)$$

Since each term of the sum has the form (56) and satisfies the Schrödinger equation, so does the sum as 
a whole. Moreover, if coefficients $c_n$ are derived in accordance with Eq. (66), then solution (67) 
satisfies the initial conditions as well. At this moment we can again get some help from 
mathematicians, who tell us that the partial differential equation of type (28) with the Hamiltonian 
operator (41), with fixed initial conditions, may have only one (unique) solution. This means that in our 
case of motion in a time-independent potential $U = U(\mathbf{r})$, Eq. (67) gives the general solution of the 
time-dependent Schrödinger equation (25) for our case: 

$$i\hbar \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \Psi + U(\mathbf{r}) \Psi. \qquad (1.68)$$

We will repeatedly use this key fact through the course, though in many cases, following the physical 
sense of particular problems, we will be more interested in certain specific particular solutions of Eq. (68) 
rather than in the whole linear superposition (67). 

In order to get some feeling of functions $\psi_n$, let us consider perhaps the simplest example, which 
nevertheless will be the basis for discussion of many less trivial problems: a particle confined in a 
rectangular quantum well 45 with a flat “bottom” and sharp and infinitely high “hard walls”: 

$$U(\mathbf{r}) = \begin{cases} 0, & \text{for } 0 < x < a_x, \ 0 < y < a_y, \text{ and } 0 < z < a_z, \\ +\infty, & \text{otherwise}. \end{cases} \qquad (1.69)$$

The only way to keep the product $U\psi_n$ in Eq. (63) finite outside the well is to have $\psi_n = 0$ in these 
regions. Also, the function has to be continuous everywhere, to avoid the divergence of its Laplace 
operator. Hence, we may solve the stationary Schrödinger equation (63) only inside the well, where it 
takes a simple form 46 

$$-\frac{\hbar^2}{2m} \nabla^2 \psi_n = E_n \psi_n, \qquad (1.70a)$$

with zero boundary conditions on all the walls. For our particular geometry, it is natural to express the 
Laplace operator in the Cartesian coordinates $\{x, y, z\}$ aligned with the well sides, so that we get the 
following boundary problem: 


45 By using the term “quantum well” for what is essentially a potential well, I bow to a common but very 
unfortunate convention. Indeed, this term seems to imply that the particle’s confinement in such a “quantum well” 
is a phenomenon specific to quantum mechanics, while, as we will repeatedly see in this course, the opposite 
is true: quantum effects do as much as they can to overcome the particle’s confinement in a potential well, 
letting the particle partly penetrate into the “classically forbidden” regions. 

46 Rewritten as $\nabla^2 f + k^2 f = 0$, this is the Helmholtz equation, which describes scalar waves of any nature (with 
wave vector $\mathbf{k}$) in a uniform, linear medium - see, e.g., CM Sec. 5.5 and/or EM Secs. 7.7-7.9. 

$$-\frac{\hbar^2}{2m} \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) \psi_n = E_n \psi_n, \quad \text{for } 0 < x < a_x, \ 0 < y < a_y, \text{ and } 0 < z < a_z;$$
$$\psi_n = 0, \quad \text{for } x = 0 \text{ and } a_x; \ y = 0 \text{ and } a_y; \ z = 0 \text{ and } a_z. \qquad (1.70b)$$


This problem may be readily solved using the same variable separation method which was used 
earlier in this section to separate the spatial and temporal variables, now to separate Cartesian spatial 
variables from each other. Let us look for a particular solution in the form 

$$\psi(\mathbf{r}) = X(x) Y(y) Z(z). \qquad (1.71)$$

(It is convenient to postpone taking care of proper indices for a minute.) Plugging this expression into 
Eq. (70b) and dividing by $\psi = XYZ$, we get 

$$-\frac{\hbar^2}{2m} \left( \frac{1}{X} \frac{d^2 X}{dx^2} + \frac{1}{Y} \frac{d^2 Y}{dy^2} + \frac{1}{Z} \frac{d^2 Z}{dz^2} \right) = E. \qquad (1.72)$$

Now let us repeat the standard argumentation of the variable separation method: since each term 
in the parentheses may be only a function of the corresponding argument, the equality is possible only if 
each term is a constant - with the dimensionality of energy. Calling them $E_x$, etc., we get three 1D 
equations 

$$-\frac{\hbar^2}{2m} \frac{1}{X} \frac{d^2 X}{dx^2} = E_x, \quad -\frac{\hbar^2}{2m} \frac{1}{Y} \frac{d^2 Y}{dy^2} = E_y, \quad -\frac{\hbar^2}{2m} \frac{1}{Z} \frac{d^2 Z}{dz^2} = E_z, \qquad (1.73)$$

with Eq. (72) turning into the energy-matching condition 

$$E_x + E_y + E_z = E. \qquad (1.74)$$


All three ordinary differential equations (73), and their solutions, are similar. For example, for 
$X(x)$, we have a 1D Helmholtz equation 

$$\frac{d^2 X}{dx^2} + k_x^2 X = 0, \quad \text{with } k_x^2 = \frac{2m E_x}{\hbar^2}, \qquad (1.75)$$

and simple boundary conditions: $X(0) = X(a_x) = 0$. Let me hope that the reader knows how to solve this 
well-known 1D boundary problem - describing, for example, usual mechanical waves on a guitar string, 
though with a very different expression for $k_x$. The problem allows an infinite number of 
sinusoidal standing-wave solutions, 47 

Rectangular quantum well: partial eigenfunctions:
$$X = \left( \frac{2}{a_x} \right)^{1/2} \sin k_x x = \left( \frac{2}{a_x} \right)^{1/2} \sin \frac{\pi n_x x}{a_x}, \quad \text{with } n_x = 1, 2, \ldots, \qquad (1.76)$$

corresponding to eigenenergies 

$$E_x = \frac{\hbar^2 k_x^2}{2m} = \frac{\pi^2 \hbar^2}{2m a_x^2} n_x^2 = E_{x1} n_x^2. \qquad (1.77)$$


47 The front coefficient is selected in a way that ensures the (ortho)normality condition (64). 

Figure 7 shows this result using a somewhat odd but very graphic and hence common way when the 
eigenenergy values (frequently called energy levels) are used as horizontal axes for plotting 
eigenfunctions, despite their different dimensionality. 
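
The partial eigenfunctions (76) and eigenenergies (77) may be verified numerically. The sketch below checks the 1D version of the orthonormality condition (64) by a midpoint-rule integration, and recovers $E_x$ from the curvature of $X(x)$, using Eq. (75). The units $\hbar = m = 1$, the well width $a_x = 1$, the integration grid, and all the function names are my assumptions.

```python
import math


def X(n, x, a=1.0):
    """Partial eigenfunction (76) of the 1D hard-wall well of width a."""
    return math.sqrt(2.0 / a) * math.sin(math.pi * n * x / a)


def overlap(n1, n2, a=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of X_n1 * X_n2 over [0, a]."""
    dx = a / steps
    return sum(X(n1, (i + 0.5) * dx, a) * X(n2, (i + 0.5) * dx, a)
               for i in range(steps)) * dx


def energy_from_curvature(n, x, a=1.0, h=1e-4):
    """E_x = -(1/2) X''/X in units hbar = m = 1, with a central difference."""
    d2X = (X(n, x + h, a) - 2 * X(n, x, a) + X(n, x - h, a)) / h**2
    return -0.5 * d2X / X(n, x, a)


def exact_energy(n, a=1.0):
    """Eq. (77) in units hbar = m = 1."""
    return (math.pi * n / a) ** 2 / 2.0


print(overlap(1, 1), overlap(1, 2), overlap(2, 5))
print(energy_from_curvature(3, 0.2), exact_energy(3))
```

The diagonal overlap should be close to 1, the off-diagonal ones close to 0, and the two energy values should agree within the finite-difference error.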


Due to the similarity of all Eqs. (73), $Y(y)$ and $Z(z)$ are similar functions of their arguments, and 
may also be numbered by integers (say, $n_y$ and $n_z$) independent of $n_x$, so that the spectrum of the total 
energy (74) is 

Rectangular quantum well: energy levels:
$$E = E_x + E_y + E_z = \frac{\pi^2 \hbar^2}{2m} \left( \frac{n_x^2}{a_x^2} + \frac{n_y^2}{a_y^2} + \frac{n_z^2}{a_z^2} \right). \qquad (1.78)$$

Fig. 1.7. Eigenfunctions (solid lines) and eigenvalues (dashed lines) of the 1D wave equation (75) on a 
finite-length segment. Solid black lines show the potential energy profile of the problem. 
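
To get a feeling of the spectrum (78), it may be tabulated for a particular geometry. The sketch below does this for a cubic well, $a_x = a_y = a_z = a$ (this special case, the cutoff `n_max`, and the function name are my assumptions), listing the lowest energies in units of $\pi^2\hbar^2/2ma^2$, together with the number of different sets $\{n_x, n_y, n_z\}$ giving each value.

```python
from collections import Counter
from itertools import product


def cubic_well_levels(n_max=6):
    """Pairs (n_x^2 + n_y^2 + n_z^2, number of triples giving this sum),
    sorted by energy, for all quantum numbers from 1 to n_max."""
    counts = Counter(nx * nx + ny * ny + nz * nz
                     for nx, ny, nz in product(range(1, n_max + 1), repeat=3))
    return sorted(counts.items())


levels = cubic_well_levels()
for energy, multiplicity in levels[:6]:
    print(energy, multiplicity)
```

With this cutoff, the listed values are complete only for the sums below $n_\text{max}^2 + 2$; the lowest six levels are well within this range.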


Thus, in this 3D problem, the role of index $n$ in Eq. (67) is played by a set of 3 independent 
integers $\{n_x, n_y, n_z\}$. In quantum mechanics, such integers play a key role, and thus have a special name, 
quantum numbers. Now the general solution (67) of our simple problem may be presented as the sum 

Rectangular quantum well: general solution:
$$\Psi(\mathbf{r}, t) = \sum_{n_x, n_y, n_z = 1}^{\infty} c_{n_x, n_y, n_z} \sin \frac{\pi n_x x}{a_x} \sin \frac{\pi n_y y}{a_y} \sin \frac{\pi n_z z}{a_z} \exp\left\{ -i \frac{E_{n_x, n_y, n_z}}{\hbar} t \right\}, \qquad (1.79)$$

with the coefficients which may be readily calculated from the initial wavefunction $\Psi(\mathbf{r}, 0)$, using Eq. 
(66), again with the replacement $n \to \{n_x, n_y, n_z\}$. This simplest problem is a good illustration of the 
basic features of wave mechanics for a spatially-confined motion, including the discrete energy 
spectrum, and (in this case, evidently) orthogonal eigenfunctions. 
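
The recipe given by Eqs. (66) and (67) may be illustrated by its 1D analog. In the sketch below, the initial wavefunction is a normalized parabola, which is purely my hypothetical choice, just as the units $\hbar = m = 1$, the well width $a = 1$, the integration grid, and the truncation of the sum at $n = 15$. For this initial state the coefficients are known analytically: $c_n = 8\sqrt{15}/(\pi n)^3$ for odd $n$, and zero for even $n$, which gives a convenient check.

```python
import cmath
import math

A = 1.0  # well width; assumed units: hbar = m = 1


def eigenfunction(n, x):
    """1D eigenfunction (76)."""
    return math.sqrt(2.0 / A) * math.sin(math.pi * n * x / A)


def eigenenergy(n):
    """1D eigenenergy (77)."""
    return (math.pi * n / A) ** 2 / 2.0


def initial_state(x):
    """Hypothetical initial wavefunction: a normalized parabola."""
    return math.sqrt(30.0 / A**5) * x * (A - x)


def integrate(f, steps=4000):
    """Midpoint rule over the segment [0, A]."""
    dx = A / steps
    return sum(f((i + 0.5) * dx) for i in range(steps)) * dx


def coefficient(n):
    """1D analog of Eq. (66); the eigenfunctions used here are real."""
    return integrate(lambda x: eigenfunction(n, x) * initial_state(x))


def wavefunction(x, t, coeffs):
    """1D analog of Eq. (67), truncated to the supplied coefficients."""
    return sum(c * eigenfunction(n, x) * cmath.exp(-1j * eigenenergy(n) * t)
               for n, c in coeffs.items())


coeffs = {n: coefficient(n) for n in range(1, 16)}
norm = sum(c * c for c in coeffs.values())  # does not depend on time
print(coeffs[1], coeffs[2], norm)
print(abs(wavefunction(0.5, 0.0, coeffs) - initial_state(0.5)))
```

Since the time-dependent factors in Eq. (67) have unit modulus, the sum of $|c_n|^2$, and hence the normalization of the wavefunction, stays the same at all times.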


An example of the opposite limit of a continuous spectrum for unconfined motion of a free 
particle is given by plane waves (29) which, with the account of relations $E = \hbar\omega$ and $\mathbf{p} = \hbar\mathbf{k}$, may be 
viewed as the product of the time-dependent factor (61) by the eigenfunction 

Free particle: eigenfunctions:
$$\psi_{\mathbf{k}} = \exp\{i \mathbf{k} \cdot \mathbf{r}\}, \qquad (1.80)$$

that is the solution to the stationary Schrödinger equation (70a) if it is valid in the whole space. 48 

The reader should not be worried too much by the fact that the fundamental solution (80) in free 
space is a traveling wave (having, in particular, a nonvanishing value (50) of the probability current 
density $\mathbf{j}$), while those inside a quantum well are standing waves, with $\mathbf{j} = 0$, even though the free 
space may be legitimately considered as the ultimate limit of a quantum well with volume 
$V = a_x a_y a_z \to \infty$. Indeed, due to the linearity of wave mechanics, two traveling-wave solutions (80) with 
equal and opposite values of momentum (and hence with the same energy) may be readily combined to give a 
standing-wave solution, for example $\exp\{i\mathbf{k} \cdot \mathbf{r}\} + \exp\{-i\mathbf{k} \cdot \mathbf{r}\} = 2\cos(\mathbf{k} \cdot \mathbf{r})$, with the net 
current $\mathbf{j} = 0$. Thus, depending on convenience for solution of a particular problem, we can present the 
general solution as a sum of either traveling-wave or standing-wave eigenfunctions. 

48 In some systems (e.g., a particle interacting with a finite-depth quantum well), a discrete energy spectrum 
within a certain interval of energies may coexist with a continuous spectrum in a complementary interval. 
However, the conceptual philosophy of eigenfunctions and eigenvalues remains the same in this case as well. 

Since in the free space there are no boundary conditions to satisfy, Cartesian components of the 
wave vector $\mathbf{k}$ in Eq. (80) can take any real values. (This is why it is more convenient to label the 
wavefunctions and eigenenergies, 

Free particle: eigenenergies:
$$E_k = \frac{\hbar^2 k^2}{2m}, \qquad (1.81)$$

by their wave vector $\mathbf{k}$ rather than an integer index.) However, one aspect of systems with continuous 
spectrum requires a bit more mathematical caution: summation (67) should be replaced by integration over a 
continuous index or indices (in this case, 3 components of vector $\mathbf{k}$). The main rule of such replacement 
may be readily extracted from Eq. (76): according to this relation, for standing-wave solutions, the 
eigenvalues of $k_x$ are equidistant, i.e. separated by equal intervals $\Delta k_x = \pi/a_x$ (with similar relations 
for the other two Cartesian components of vector $\mathbf{k}$). Hence the number of different eigenvalues of the 
standing-wave vector $\mathbf{k}$ (with $k_x, k_y, k_z > 0$), within a volume $d^3k \gg 1/V$ of the $\mathbf{k}$ space is just 
$dN = d^3k/(\Delta k_x \Delta k_y \Delta k_z) = (V/\pi^3) d^3k$. Since in continuum it is more convenient to work with traveling 
waves, we should take into account that, as was just discussed, there are two different traveling wave 
vectors ($\mathbf{k}$ and $\mathbf{k}' = -\mathbf{k}$) corresponding to each standing wave vector $\mathbf{k}$. Hence the same number of 
physically different states corresponds to a $2^3 = 8$-fold larger $\mathbf{k}$ space (which now is infinite in all 
directions) or, equivalently, to a smaller number of states per unit volume $d^3k$: 

3D number of states:
$$dN = \frac{V}{(2\pi)^3} d^3k. \qquad (1.82)$$

For $dN \gg 1$, this expression is independent of the boundary conditions, 49 and is frequently 
presented as the following summation rule 

Summation over 3D states:
$$\lim_{k^3 V \to \infty} \sum_{\mathbf{k}} f(\mathbf{k}) = \int f(\mathbf{k}) \, dN = \frac{V}{(2\pi)^3} \int f(\mathbf{k}) \, d^3k, \qquad (1.83)$$

where $f(\mathbf{k})$ is an arbitrary function of $\mathbf{k}$. This rule is very important for statistical physics. Note also that 
if the same wave vector $\mathbf{k}$ corresponds to several internal quantum states (such as spin - see Chapter 4), 
the right-hand part of Eq. (83) requires multiplication by the corresponding degeneracy factor. 
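
The state counting rule may be checked by brute force. The sketch below counts the standing-wave states of a cubic hard-wall well of side $a$ (i.e. the sets of positive integers $n_x, n_y, n_z$) with $|\mathbf{k}| < K = \pi R/a$, and compares the result with the continuum estimate following from $dN = V d^3k/(2\pi)^3$, integrated over the full sphere of radius $K$, which gives $N = \pi R^3/6$. The cubic geometry, the integer values of $R$, and the function names are my assumptions.

```python
import math


def count_states(R):
    """Number of triples n_x, n_y, n_z >= 1 with n_x^2 + n_y^2 + n_z^2 < R^2,
    for an integer R."""
    total = 0
    for nx in range(1, R):
        for ny in range(1, R):
            rest = R * R - nx * nx - ny * ny
            if rest <= 0:
                break  # larger n_y can only make `rest` smaller
            total += math.isqrt(rest - 1)  # largest n_z with n_z^2 < rest
    return total


def continuum_estimate(R):
    """(V / (2 pi)^3) * (4 pi / 3) * K^3, with V = a^3 and K = pi R / a."""
    return math.pi * R**3 / 6.0


def ratio(R):
    return count_states(R) / continuum_estimate(R)


for R in (20, 40, 80):
    print(R, count_states(R), round(ratio(R), 4))
```

The ratio of the exact count to the continuum estimate should approach 1 as $R$ grows; the remaining difference is a surface correction that depends on the boundary conditions and becomes relatively small for large numbers of states.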


1.6. Dimensionality reduction 

To conclude this introductory chapter, let me discuss the conditions when the spatial 
dimensionality of a wave mechanics problem may be reduced. 50 For example, following our discussion 
of the 3D rectangular, flat-bottom quantum well in Sec. 5, let us consider an infinitely deep quantum 
well whose bottom is flat only in one direction, say $z$: 

$$U(\mathbf{r}) = \begin{cases} U(x, y), & \text{for } 0 < z < a_z, \\ +\infty, & \text{otherwise}. \end{cases} \qquad (1.84)$$

49 For a more detailed discussion of this point, the reader may be referred, e.g., to CM Sec. 5.4 (in the context of 
1D mechanical waves), because it is valid for waves of any nature. 


In this case, we can separate variables only partly, by presenting the eigenfunction as $\psi(x, y) Z(z)$. 
Plugging such a solution into the corresponding form of the stationary Schrödinger equation (63), we see 
that functions $Z(z)$ are again similar to those given by Eq. (76), while function $\psi(x, y)$ satisfies the 
following 2D stationary Schrödinger equation: 

2D stationary Schrödinger equation:
$$-\frac{\hbar^2}{2m} \nabla^2_{x,y} \psi + U_\text{ef}(x, y) \psi = E \psi, \qquad (1.85)$$

where 

Effective potential energy:
$$U_\text{ef}(x, y) = U(x, y) + E_z = U(x, y) + \frac{\pi^2 \hbar^2 n_z^2}{2m a_z^2}. \qquad (1.86)$$


Thus, we have arrived at the boundary problem similar to the initial one, but with the spatial 
dimensionality reduced from 3 to 2, due to what is called the partial confinement 51 in direction $z$. If all 
partial functions $Z(z)$ are normalized to unity, the wavefunction normalization condition (22c) becomes 

$$W = \int_A \psi(x, y) \psi^*(x, y) \, dx \, dy, \qquad (1.87)$$

where $A$ is the total area of the system on the $[x, y]$ plane, and is formally similar to the initial 3D 
normalization condition. However, the effective 2D potential energy $U_\text{ef}(x, y)$ includes term $E_z$ depending 
on quantum number $n_z$, 52 making the physical relevance of such variable separation much less general 
than might be naively expected. There are three possible cases: 

(i) If there is no strong relation between the energy scale $E_{xy}$ of potential $U(x, y)$ and $E_z$, the 
solution of a typical problem has to be presented as a (typically, large) sum of partial solutions 
$\psi(x, y) Z(z)$, each with its own $n_z$, $U_\text{ef}$, and $E_z$. In this general case, the variable separation may not 
provide much relief at all, because eigenenergies of solutions with different $n_z$ may be close, so that 
several of them would simultaneously participate in realistic processes. 

(ii) $E_z$ is much smaller than $E_{xy}$ and may be neglected. This may be the case, for example, if the 
potential profile is steeper along axes $x$ and $y$ than along direction $z$. Notice, however, that the 
condition $a_z \to \infty$ does not guarantee the smallness of $E_z$, because it may be compensated by large 
values of $n_z$. In this case (typical for solid state problems), either summation or integration over $n_z$ still 
may be needed, though sometimes it may be carried out analytically, because functions $Z(z)$ are simple 
sinusoidal waves. 

(iii) Counter-intuitively, the most robust dimensionality reduction is possible in the opposite 
limit, when $a_z$ is much smaller than the characteristic scale of motion within the $[x, y]$ plane (Fig. 8a). 
Indeed, in this case the distance between adjacent levels of the confinement energy $E_z$ is much larger 
than the characteristic energy $E_{xy}$ of motion within the plane. As a result, if the system was initially 
prepared to be on the lowest, ground level of $E_z$, a “soft” motion along $x$ and $y$ cannot excite the system 
to higher levels of $E_z$. 53 Hence, the system keeps the fixed quantum number $n_z = 1$ through the motion, 
so that the confinement energy $E_z$ is constant and, according to Eq. (86), may be treated just as a fixed 
potential energy offset. 

50 Many textbooks on quantum mechanics jump to the solution of 1D problems without such discussion, and most of my 
beginning graduate students did not understand that in realistic physical systems, such dimensionality restriction 
is only possible under very specific conditions. 

51 The term “quantum confinement”, sometimes used to describe this phenomenon, is as unfortunate as the 
“quantum well”, for the same reason: the confinement is a purely classical effect, and as we will 
repeatedly see in this course, quantum mechanics reduces it, allowing a partial penetration of the particle into the 
classically forbidden regions with $E < U(\mathbf{r})$. 

52 The last term in Eq. (86) is frequently referred to as the (partial) confinement energy; despite its inclusion into 
$U_\text{ef}$, it is important to remember the kinetic-energy origin of this contribution. 

The last conclusion is true even if the quantum well’s profile in direction $z$ is not rectangular 
(provided that $E_z$ is still much larger than $E_{xy}$). For example, many 2D quantum phenomena, such as the 
quantum Hall effect, 54 have been studied experimentally using electrons confined at semiconductor 
heterojunctions (e.g., epitaxial interfaces GaAs/Al$_x$Ga$_{1-x}$As) where the potential well in the direction 
perpendicular to the interface has a nearly triangular shape, with the splitting of energies $E_z$ of the order 
of $10^{-2}$ eV. 55 This splitting energy corresponds to $k_\text{B}T$ at temperature ~100 K, so that careful 
experimentation at liquid helium temperatures (4 K and below) may keep the electrons performing 
purely 2D motion in the “lowest subband” ($n_z = 1$). 
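
The energy scales in this estimate are easy to reproduce. The sketch below takes the splitting to be $10^{-2}$ eV (an order-of-magnitude value, consistent with the statement that it corresponds to $k_\text{B}T$ at about 100 K) and compares it with the thermal energy at several temperatures. The particular temperature values and all the names are my assumptions.

```python
K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant, in eV/K
SPLITTING_EV = 1e-2         # order-of-magnitude subband splitting, in eV


def thermal_energy_ev(temperature_k):
    """Thermal energy k_B * T, in electron-volts."""
    return K_B_EV_PER_K * temperature_k


for temperature in (300.0, 100.0, 4.2):
    kT = thermal_energy_ev(temperature)
    print(temperature, kT, SPLITTING_EV / kT)
```

At room temperature the thermal energy exceeds such a splitting, so that several subbands would be populated, while at liquid helium temperatures the splitting is larger than $k_\text{B}T$ by more than an order of magnitude.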




Fig. 1.8. Partial confinement in: (a) one dimension, and (b) two dimensions. 


Now, if a quantum well is formed in two dimensions (say, $y$ and $z$, see Fig. 8b), 56 

$$U(\mathbf{r}) = \begin{cases} U(x), & \text{for } 0 < y < a_y \text{ and } 0 < z < a_z, \\ +\infty, & \text{otherwise}, \end{cases} \qquad (1.88)$$

then repeating the variable separation procedure we see that the 3D Schrödinger equation (68) may be 
satisfied with particular solutions of the type (71), again with sinusoidal standing waves $Y(y)$ and $Z(z)$, 
but generally a more complex function $X(x)$, which has to satisfy the following 1D Schrödinger equation 

1D stationary Schrödinger equation:
$$-\frac{\hbar^2}{2m} \frac{d^2 X}{dx^2} + U_\text{ef}(x) X = E X, \qquad (1.89)$$


53 In the frequent case when motion in the [x, y ] plane is free (or almost free), the set of quantum states with the 
same quantum number n z is frequently called a subband, because their energies form a (quasi-) continuum of 
eigenenergies E xy . 

54 To be discussed in Sec. 3.2. 

55 See, e.g., P. Harrison, Quantum Wells, Wires, and Dots, 3 rd ed., Wiley, 2010. 

56 This is a reasonable first approximation, for example, for electron motion potential in so-called quantum wires, 
for example in the now-famous carbon nanotubes - see, e.g., the same monograph by P. Harrison. 
with the effective potential energy

$$U_{ef}(x) = U(x) + E_y + E_z. \tag{1.90}$$


Again, if the particle stays in the lowest subband, $n_y = n_z = 1$, both $E_y$ and $E_z$ retain their constant values
$E_{y1}$ and $E_{z1}$. Repeating the above discussion of the one-dimensional partial confinement, we can expect
that a wave mechanics problem may be substantially simplified if $E_{y1}$ and $E_{z1}$ are much larger than the
energy scale $E_x$ of the motion in direction x. Namely, if:


(i) the potential profile within the 2D partial confinement plane [y, z] is arbitrary (provided that it
provides partial confinement scales $a_y$ and $a_z$ much smaller than the spatial scale of the motion in direction x),
and

(ii) the potential energy U is either constant in time or changes relatively slowly, at a time scale
$\tau \gg \hbar/E_{yz1}$ (where $E_{yz1}$ is the lowest eigenenergy of motion within the [y, z] plane),

then a large range of experiments may be adequately described by looking for the solution of the general
(time-dependent, 3D) Schrödinger equation in the form of the following product

$$\Psi(x,t)\,YZ_1(y,z)\exp\left\{-i\frac{E_{yz1}}{\hbar}t\right\}, \tag{1.91}$$


where $YZ_1$ is the lowest (ground-state) eigenfunction of the 2D problem in the [y, z] plane. Substituting
this solution into the equation, and separating variables {y, z} from {x, t}, we obtain the following
time-dependent, 1D equation

$$i\hbar\frac{\partial\Psi(x,t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\Psi(x,t)}{\partial x^2} + U(x,t)\Psi(x,t). \tag{1.92}$$


The next chapter will be devoted to a detailed discussion of the wave mechanics described by
this 1D equation, because it allows one to study most basic phenomena and concepts of wave mechanics
without involving overly complex math. In that chapter, for notation simplicity, the energy $E_x$ of the 1D
motion will be referred to just as E. However, one should always remember that each “1D problem” has
two hidden degrees of freedom and that the genuine energy of the particle also includes a constant shift
$E_{yz1}$ which is typically much larger than $E_x$. The Universe is (at least :-) 3-dimensional, and it shows!


Finally, note that in systems with reduced dimensionality, Eq. (82) for the number of states at
large k (i.e., for an essentially free particle motion) should be replaced accordingly: in a 2D system of
area $A \gg 1/k^2$,

$$dN = \frac{A}{(2\pi)^2}\,d^2k, \tag{1.93}$$

while in a 1D system of length $l \gg 1/k$,

$$dN = \frac{l}{2\pi}\,dk, \tag{1.94}$$


with the corresponding changes of the summation rule (83). This change has important implications for
the density of states on the energy scale, dN/dE: it is straightforward (and hence left for the reader :-) to
use Eqs. (82), (93), and (94) to show that for free 3D particles the density increases with E
(proportionally to $E^{1/2}$), for free 2D particles it does not depend on energy, while for free 1D particles it
scales as $E^{-1/2}$, i.e. decreases with energy.
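
These scaling laws can be checked numerically by brute-force counting. The sketch below assumes traveling-wave states with wave vectors on a uniform grid (k proportional to an integer vector n, as for periodic boundary conditions), so that $E \propto |n|^2$; it counts the states below two energies and extracts the power-law exponent of dN/dE. The function names and the chosen radii are illustrative assumptions, not part of the text.

```python
import math
from itertools import product

def count_states(dim, radius):
    """Count integer vectors n with dim components and |n| <= radius."""
    r = int(radius)
    r2 = radius * radius
    return sum(1 for n in product(range(-r, r + 1), repeat=dim)
               if sum(c * c for c in n) <= r2)

def dos_exponent(dim, r1=15.0, r2=30.0):
    """Estimate s in dN/dE ~ E**s from state counts at two radii.

    Since E ~ |n|**2, the total count scales as N ~ E**(dim/2),
    and hence dN/dE ~ E**(dim/2 - 1).
    """
    n1, n2 = count_states(dim, r1), count_states(dim, r2)
    exponent_of_count = math.log(n2 / n1) / math.log((r2 / r1) ** 2)
    return exponent_of_count - 1.0

for dim in (1, 2, 3):
    print(f"d = {dim}: dN/dE ~ E^{dos_exponent(dim):+.2f}")
```

The estimated exponents come out close to -1/2, 0, and +1/2 for d = 1, 2, and 3, with small deviations caused by the finite radii used in the count.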


1.7. Exercise problems 

1.1. The actual postulate made by N. Bohr in his original 1913 paper was not directly Eq. (10),
but an assumption that at quantum leaps between adjacent large (quasiclassical) orbits with $n \gg 1$,
the hydrogen atom either emits or absorbs energy $\Delta E = \hbar\omega$, where $\omega$ is its classical radiation frequency -
according to classical electrodynamics, equal to the angular velocity of the electron's rotation. Prove that
this postulate is indeed compatible with Eqs. (8)-(10).

1.2. Use Eq. (53) to prove that linear operators of quantum mechanics are commutative:
$\hat{A}_2 + \hat{A}_1 = \hat{A}_1 + \hat{A}_2$, and associative: $(\hat{A}_1 + \hat{A}_2) + \hat{A}_3 = \hat{A}_1 + (\hat{A}_2 + \hat{A}_3)$.


1.3. Prove that for any Hamiltonian operator $\hat{H}$ and two arbitrary complex functions f(r) and g(r),

$$\int f(\mathbf{r})\,\hat{H}g(\mathbf{r})\,d^3r = \int \hat{H}f(\mathbf{r})\,g(\mathbf{r})\,d^3r.$$


1.4. Prove that the Schrödinger equation (1.25) with Hamiltonian (1.41) is Galilean-invariant,
provided that the wave function is transformed as

$$\Psi'(\mathbf{r}',t') = \Psi(\mathbf{r},t)\exp\left\{-i\frac{m\mathbf{v}\cdot\mathbf{r}}{\hbar} + i\frac{mv^2t}{2\hbar}\right\},$$

where the prime sign denotes the variables measured in the reference frame O' that moves, without
rotation, with a constant velocity v relative to the “lab” frame O. Give a physical interpretation of this
transformation.


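1.5. Prove the so-called Hellmann-Feynman theorem: 57

$$\frac{\partial E_n}{\partial\lambda} = \left\langle\frac{\partial\hat{H}}{\partial\lambda}\right\rangle_n,$$

where $\lambda$ is some parameter on which the Hamiltonian $\hat{H}$, and hence its eigenenergies $E_n$, depend.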


1.6. Calculate $\langle x\rangle$, $\langle p_x\rangle$, $\delta x$, and $\delta p_x$ for eigenstate $\{n_x, n_y, n_z\}$ of a rectangular, infinitely deep
quantum well (69). Compare product $\delta x\,\delta p_x$ with Heisenberg's uncertainty relation.


57 Despite the theorem's name, H. Hellmann (in 1937) and R. Feynman (in 1939) were not the first in the long list
of physicists who have (apparently, independently) discovered this fact. Indeed, it may be traced back at least to a
1922 paper by W. Pauli, and was carefully proved by P. Güttinger in 1931.
1.7. A particle, placed in a hard-wall, rectangular box with sides $a_x$, $a_y$, and $a_z$, is in its ground
state. Calculate the average force acting on each face of the box. Can the forces be characterized by a
certain pressure?


1.8. A 1D quantum particle was initially in the ground state of a very deep, rectangular quantum
well of width a:

$$U(x) = \begin{cases} 0, & \text{for } -a/2 < x < +a/2, \\ +\infty, & \text{otherwise.} \end{cases}$$

At some instant, the well's width is abruptly increased to value a' > a (leaving the well symmetric about
point x = 0), and then left constant. Calculate the probability that after the change, the particle is still in
the ground state of the system.


1.9. At t = 0, a 1D particle of mass m is placed into a hard-wall, flat-bottom potential well

$$U(x) = \begin{cases} 0, & \text{for } 0 < x < a, \\ +\infty, & \text{otherwise,} \end{cases}$$

in a 50/50 linear superposition of the lowest (ground) and the first excited states, so that its
wavefunction at that instant is

$$\Psi(x,0) = C\left[\psi_g(x) + \psi_e(x)\right],$$

where C is the normalization constant which ensures that the particle is (somewhere) in the well with
probability W = 1. Calculate:

(i) the normalized wavefunction $\Psi(x,t)$ for arbitrary time t, and

(ii) the time evolution of the expectation value $\langle x\rangle$ of the particle's coordinate.

1.10. Find the potential profile U(x) for which the following wavefunctions,

(i) $\Psi = c\exp\{-ax^2 - ibt\}$, and

(ii) $\Psi = c\exp\{-a|x| - ibt\}$,

(with real coefficients a > 0 and b), satisfy the Schrödinger equation for a particle with mass m. For each
case, calculate $\langle x\rangle$, $\langle p_x\rangle$, $\delta x$, and $\delta p_x$, and compare the product $\delta x\,\delta p_x$ with Heisenberg's uncertainty
relation.


1.11. Calculate the energy density dN/dE of traveling wave states in large rectangular quantum
wells of various dimensions: d = 1, 2, and 3.

1.12. Use the finite difference method with steps a/2 and a/3 to find as many eigenenergies as
possible for a particle in the infinitely deep, hard-wall quantum well of width a. Compare the results
with each other and with the exact formula. 58


58 You may like to start from reading about the finite-difference method - see, e.g., CM Sec. 8.5 or EM Sec. 2.8. 
Chapter 2. 1D Wave Mechanics

The main goal of this chapter is the solution and discussion of a few conceptually most important
problems of wave mechanics for the simplest, 1D case. This lowest dimensionality, and a wide use of
the approximation of potential profiles by sets of Dirac's delta-functions, simplify the necessary calculations
considerably without sacrificing the physical essence of the described phenomena. The reader is
advised to pay special attention to Sections 6-9, which cover some important material not usually
discussed in textbooks.


2.1. Probability current and uncertainty relations 


As was discussed at the end of Chapter 1, in several cases (most importantly, at strong
confinement within the [y, z] plane), the general (3D) Schrödinger equation may be reduced to the 1D
equation (1.92):

$$i\hbar\frac{\partial\Psi(x,t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\Psi(x,t)}{\partial x^2} + U(x,t)\Psi(x,t). \tag{2.1}$$


If the transversal factor (say, the function $YZ_1(y,z)$ that participates in Eq. (1.91)) is normalized to
unity, then the integration of Eq. (1.22a) over a segment $[x_1, x_2]$ gives the probability to find the
particle on this segment:

$$W(t) \equiv \int_{x_1}^{x_2}\Psi(x,t)\Psi^*(x,t)\,dx. \tag{2.2}$$


If the particle under analysis is definitely inside the system, the normalization of its 1D wavefunction
$\Psi(x,t)$ is provided by extending integral (2) to the whole axis x:

$$\int_{-\infty}^{+\infty} w(x,t)\,dx = 1, \quad \text{where } w(x,t) \equiv \Psi(x,t)\Psi^*(x,t). \tag{2.3}$$


A similar integration of Eq. (1.23) shows that the expectation value of any operator depending only on
coordinate x (and possibly time), may be expressed as

$$\langle A\rangle(t) = \int_{-\infty}^{+\infty}\Psi^*(x,t)\,\hat{A}\,\Psi(x,t)\,dx. \tag{2.4}$$

It is also useful to introduce the probability current along the x-axis (a scalar):

$$I(x,t) \equiv \int j_x\,dy\,dz = \frac{\hbar}{m}\,\mathrm{Im}\left(\Psi^*\frac{\partial\Psi}{\partial x}\right) = \frac{\hbar}{m}|\Psi|^2\frac{\partial\varphi}{\partial x}, \tag{2.5}$$

where $j_x$ is the x-component of the probability current density vector $\mathbf{j}(\mathbf{r},t)$, and $\varphi$ is the phase of the
wavefunction. Then the continuity equation (1.48) for the segment $[x_1, x_2]$ takes the form

$$\frac{dW}{dt} + I(x_2) - I(x_1) = 0. \tag{2.6}$$
The above formulas are the basis for the analysis of 1D problems of wave mechanics, but before
proceeding to particular cases, let me deliver on my earlier promise to prove that Heisenberg's
uncertainty relation (1.35) is indeed valid for any wavefunction $\Psi(x,t)$. For that, let us consider an
evidently positive (or at least non-negative) integral

$$J(\lambda) \equiv \int_{-\infty}^{+\infty}\left|x\Psi + \lambda\frac{\partial\Psi}{\partial x}\right|^2 dx \ge 0, \tag{2.7}$$


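where $\lambda$ is an arbitrary real constant, and assume that at $x \to \pm\infty$ the wavefunction vanishes,
together with its first derivative. The left-hand part of Eq. (7) may be recast as

$$\begin{aligned}
\int_{-\infty}^{+\infty}\left|x\Psi + \lambda\frac{\partial\Psi}{\partial x}\right|^2 dx
&= \int_{-\infty}^{+\infty}\left(x\Psi + \lambda\frac{\partial\Psi}{\partial x}\right)\left(x\Psi^* + \lambda\frac{\partial\Psi^*}{\partial x}\right)dx \\
&= \int_{-\infty}^{+\infty}x^2\Psi\Psi^*\,dx + \lambda\int_{-\infty}^{+\infty}x\left(\Psi\frac{\partial\Psi^*}{\partial x} + \frac{\partial\Psi}{\partial x}\Psi^*\right)dx + \lambda^2\int_{-\infty}^{+\infty}\frac{\partial\Psi}{\partial x}\frac{\partial\Psi^*}{\partial x}\,dx.
\end{aligned} \tag{2.8}$$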


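According to Eq. (4), the first term in the last form of Eq. (8) is just $\langle x^2\rangle$. The second and the third
integrals may be worked out by parts:

$$\int_{-\infty}^{+\infty}x\left(\Psi\frac{\partial\Psi^*}{\partial x} + \frac{\partial\Psi}{\partial x}\Psi^*\right)dx = \int_{-\infty}^{+\infty}x\frac{\partial}{\partial x}(\Psi\Psi^*)\,dx = \int_{x=-\infty}^{x=+\infty}x\,d(\Psi\Psi^*) = \Big[x\Psi\Psi^*\Big]_{x=-\infty}^{x=+\infty} - \int_{-\infty}^{+\infty}\Psi\Psi^*\,dx = -1, \tag{2.9}$$

$$\int_{-\infty}^{+\infty}\frac{\partial\Psi}{\partial x}\frac{\partial\Psi^*}{\partial x}\,dx = \int_{x=-\infty}^{x=+\infty}\frac{\partial\Psi^*}{\partial x}\,d\Psi = \left[\frac{\partial\Psi^*}{\partial x}\Psi\right]_{x=-\infty}^{x=+\infty} - \int_{-\infty}^{+\infty}\Psi\frac{\partial^2\Psi^*}{\partial x^2}\,dx = \frac{1}{\hbar^2}\int_{-\infty}^{+\infty}\Psi\,\hat{p}_x^2\,\Psi^*\,dx = \frac{\langle p_x^2\rangle}{\hbar^2}. \tag{2.10}$$

As a result, Eq. (7) takes the following form:

$$J(\lambda) = \langle x^2\rangle - \lambda + \lambda^2\frac{\langle p_x^2\rangle}{\hbar^2} \ge 0, \quad \text{i.e. } \lambda^2 + a\lambda + b \ge 0, \quad \text{with } a \equiv -\frac{\hbar^2}{\langle p_x^2\rangle} \text{ and } b \equiv \frac{\hbar^2\langle x^2\rangle}{\langle p_x^2\rangle}. \tag{2.11}$$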


This inequality should be valid for any real $\lambda$, i.e. the corresponding quadratic equation, $\lambda^2 + a\lambda + b = 0$,
can have either one (degenerate) real root or no real roots at all. This is only possible if its discriminant,
$a^2 - 4b$, is non-positive, leading to the following requirement:

$$\langle x^2\rangle\langle p_x^2\rangle \ge \frac{\hbar^2}{4}. \tag{2.12}$$

In particular, if $\langle x\rangle = 0$ and $\langle p_x\rangle = 0$, 1 then according to Eq. (1.33), Eq. (12) takes the form

$$\langle\tilde{x}^2\rangle\langle\tilde{p}_x^2\rangle \ge \frac{\hbar^2}{4}, \tag{2.13}$$

which, according to the definition (1.34) of r.m.s. uncertainties, is equivalent to Eq. (1.35).


1 Eq. (13) may be proved even if $\langle x\rangle$ and $\langle p_x\rangle$ are not equal to zero, by making the following replacements,
$x \to x - \langle x\rangle$, $\partial/\partial x \to \partial/\partial x + i\langle p\rangle/\hbar$, in Eq. (7), and then repeating all the calculations - which become
rather bulky. We will re-derive the uncertainty relations, in a more efficient way, in Chapter 4.


Now let us notice that Heisenberg's uncertainty relation looks very similar to the
commutation relation between the corresponding operators:

$$[\hat{x},\hat{p}_x]\Psi \equiv (\hat{x}\hat{p}_x - \hat{p}_x\hat{x})\Psi = -i\hbar x\frac{\partial\Psi}{\partial x} + i\hbar\frac{\partial}{\partial x}(x\Psi) = i\hbar\Psi. \tag{2.14a}$$

Since this relation is valid for an arbitrary wavefunction $\Psi(x,t)$, we may present it as an operator equality:

$$[\hat{x},\hat{p}_x] = i\hbar \neq 0. \tag{2.14b}$$


In Sec. 4.5 we will see that the relation between Eqs. (13) and (14) is just a particular case of a general 
relation between the expectation values of non-commuting operators and their commutators. 
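
The commutation relation (14) is easy to verify numerically. The sketch below is only an illustration under stated assumptions: units with $\hbar = 1$, an arbitrarily chosen smooth test function, and a central finite difference standing in for the derivative in the momentum operator. All names are hypothetical.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1

def psi(x):
    """An arbitrary smooth test wavefunction (Gaussian envelope times a plane wave)."""
    return cmath.exp(-x * x + 2j * x)

def momentum(f, x, h=1e-4):
    """Apply p = -i*hbar*d/dx to function f at point x (central difference)."""
    return -1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)

def commutator_on_psi(x):
    """Evaluate (x p - p x) psi at point x."""
    x_p = x * momentum(psi, x)
    p_x = momentum(lambda y: y * psi(y), x)
    return x_p - p_x

for point in (-1.0, 0.3, 1.5):
    ratio = commutator_on_psi(point) / psi(point)
    print(f"x = {point:+.1f}: [x, p]psi / psi = {ratio:.6f}")
```

At every point the ratio is close to $i\hbar$, independently of the test function, as Eq. (14b) requires.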


2.2. Free particle: Wave packets 

Let us start our discussion of particular problems with the free 1D motion, with U(x,t) = 0. From
our discussion of Eq. (1.29) in Chapter 1, it is clear that in the 1D case, a similar “fundamental” (i.e. a
particular but the most important) solution of the Schrödinger equation (1) is a monochromatic wave

$$\Psi_0(x,t) = \text{const}\times e^{i(k_0x - \omega_0t)}. \tag{2.15}$$

According to Eqs. (1.32), it corresponds to a particle with an exactly defined momentum 2 $p_0 = \hbar k_0$ and
energy $E_0 = \hbar\omega_0 = \hbar^2k_0^2/2m$. However, for this wavefunction, product $\Psi^*\Psi$ does not depend on either x
or t, so that the particle is completely delocalized, i.e. its probability is spread all over axis x, at all
times. (As a result, such a state is still compatible with Heisenberg's uncertainty relation (13), despite the
exact value $p_0$ of momentum p.)

In order to describe a space-localized particle, let us form, at the initial moment of time (t = 0), a
wave packet of the type shown in Fig. 1.6, by multiplying the sinusoidal waveform (15) by some smooth
envelope function A(x). As the most important particular example, consider a Gaussian packet

$$\Psi(x,0) = A(x)e^{ik_0x}, \quad \text{with } A(x) = \frac{1}{(2\pi)^{1/4}(\delta x)^{1/2}}\exp\left\{-\frac{x^2}{(2\delta x)^2}\right\}. \tag{2.16}$$

(By the way, Fig. 1.6 shows exactly such a packet.) The pre-exponential factor in this envelope function
has been selected in the way to have the initial probability density,

$$w(x,0) \equiv \Psi^*(x,0)\Psi(x,0) = A^*(x)A(x) = \frac{1}{(2\pi)^{1/2}\delta x}\exp\left\{-\frac{x^2}{2(\delta x)^2}\right\}, \tag{2.17}$$

normalized according to Eq. (3), for any parameters $\delta x$ and $k_0$. 3

In order to explore the evolution of this packet in time, we could try to solve Eq. (1) with the
initial condition (16) directly, but in the spirit of the discussion in Sec. 1.5, it is easier to proceed


2 From this point on, in this chapter I will drop index x in notation for x-component of vectors k and p. 

3 This may be readily proven using the well-known integral of the Gaussian function (“bell curve”) given by Eq.
(17) - see, e.g., MA Eq. (6.9b). It is also straightforward to use MA Eq. (6.9c) to prove that for wave packet (16),
parameter $\delta x$ is indeed the r.m.s. uncertainty (1.34) of coordinate x, thus justifying its notation.
differently. Let us first present the initial wavefunction (16) as a sum (1.65) of eigenfunctions $\psi_k(x)$ of
the corresponding stationary 1D Schrödinger equation (1.60), in our current case

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_k}{dx^2} = E_k\psi_k, \quad \text{with } E_k = \frac{\hbar^2k^2}{2m}, \tag{2.18}$$

that are simply monochromatic waves,

$$\psi_k = a_ke^{ikx}, \tag{2.19}$$


with a continuum spectrum of possible wave numbers k. For that, sum (1.65) should be replaced with an
integral: 4

$$\Psi(x,0) = \int a_k\psi_k(x)\,dk = \int a_ke^{ikx}\,dk. \tag{2.20}$$


Now let us notice that from the point of view of mathematics, Eq. (20) is just the usual Fourier
transform from variable k to the “conjugate” variable x, and we can use the well-known formula of the
reciprocal Fourier transform to calculate

$$a_k = \frac{1}{2\pi}\int\Psi(x,0)e^{-ikx}\,dx = \frac{1}{2\pi}\frac{1}{(2\pi)^{1/4}(\delta x)^{1/2}}\int\exp\left\{-\frac{x^2}{(2\delta x)^2} - i\tilde{k}x\right\}dx, \quad \text{where } \tilde{k} \equiv k - k_0. \tag{2.21}$$


This Gaussian integral may be worked out by the following standard method. Let us complement the
exponent to the full square of a linear combination of x and $\tilde{k}$, plus a term independent of x:

$$-\frac{x^2}{(2\delta x)^2} - i\tilde{k}x = -\frac{1}{(2\delta x)^2}\left[x + 2i(\delta x)^2\tilde{k}\right]^2 - \tilde{k}^2(\delta x)^2. \tag{2.22}$$


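Since the integration in the right-hand part of Eq. (21) should be performed at constant $\tilde{k}$, in the infinite
limits, its result would not change if we replace dx by $dx' \equiv d[x + 2i(\delta x)^2\tilde{k}]$. 5 As a result, we get

$$a_k = \frac{1}{2\pi}\frac{1}{(2\pi)^{1/4}(\delta x)^{1/2}}\exp\left\{-\tilde{k}^2(\delta x)^2\right\}\int\exp\left\{-\frac{x'^2}{(2\delta x)^2}\right\}dx' = \left(\frac{1}{2\pi}\right)^{1/2}\frac{1}{(2\pi)^{1/4}(\delta k)^{1/2}}\exp\left\{-\frac{\tilde{k}^2}{(2\delta k)^2}\right\}, \tag{2.23}$$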


so that $a_k$ also has a Gaussian distribution, now along axis k, centered at value $k_0$ (Fig. 1.6b), with
constant $\delta k$ defined as

$$\delta k \equiv \frac{1}{2\delta x}. \tag{2.24}$$

Thus we may present the initial wave packet (16) as

$$\Psi(x,0) = \left(\frac{1}{2\pi}\right)^{1/2}\frac{1}{(2\pi)^{1/4}(\delta k)^{1/2}}\int\exp\left\{-\frac{(k-k_0)^2}{(2\delta k)^2}\right\}e^{ikx}\,dk. \tag{2.25}$$


From comparison of this formula with Eq. (16), it is evident that the r.m.s. uncertainty of the wave
number k in this packet is indeed equal to $\delta k$ defined by Eq. (24), thus justifying the notation. The


4 For notation's brevity, from this point on the infinite limit signs will be dropped in all 1D integrals.

5 The fact that the argument shift is imaginary is not important, because the function under the integral is analytic,
and tends to zero at Re $x' \to \pm\infty$.
comparison of that relation with Eq. (1.35) shows that the Gaussian packet presents the ultimate case in
which the product $\delta x\,\delta p = \delta x(\hbar\,\delta k)$ has the lowest possible value ($\hbar/2$); for any other envelope's shape the
uncertainty product may only be larger. We could of course get the same result for $\delta k$ from Eq. (16)
using definitions (1.23), (1.33), and (1.34); the real advantage of Eq. (24) is that it can be readily
generalized to t > 0.
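
The relation between the two widths can be confirmed by a direct numerical Fourier transform of the packet (16). The following sketch is an illustration only: it assumes dimensionless units, the arbitrary values $\delta x = 1$ and $k_0 = 5$, and a simple rectangle-rule quadrature; all names are hypothetical.

```python
import cmath
import math

def gaussian_packet(x, dx, k0):
    """Initial wavefunction of Eq. (2.16)."""
    amplitude = (2 * math.pi) ** -0.25 * dx ** -0.5
    return amplitude * math.exp(-x * x / (2 * dx) ** 2) * cmath.exp(1j * k0 * x)

def rms_width(points, weights):
    """R.m.s. deviation of 'points' for a (not necessarily normalized) weight."""
    total = sum(weights)
    mean = sum(p * w for p, w in zip(points, weights)) / total
    variance = sum((p - mean) ** 2 * w for p, w in zip(points, weights)) / total
    return math.sqrt(variance)

def packet_widths(dx=1.0, k0=5.0, n=401):
    """Return (r.m.s. width in x, r.m.s. width in k) found numerically."""
    dk = 1.0 / (2.0 * dx)  # the value expected from Eq. (2.24), used only to size the k grid
    xs = [-10 * dx + 20 * dx * i / (n - 1) for i in range(n)]
    ks = [k0 - 6 * dk + 12 * dk * i / (n - 1) for i in range(n)]
    step = xs[1] - xs[0]
    psi0 = [gaussian_packet(x, dx, k0) for x in xs]
    # a_k = (1/2pi) * integral of Psi(x,0) exp(-ikx) dx, by the rectangle rule
    a = [sum(p * cmath.exp(-1j * k * x) for p, x in zip(psi0, xs)) * step / (2 * math.pi)
         for k in ks]
    return (rms_width(xs, [abs(p) ** 2 for p in psi0]),
            rms_width(ks, [abs(c) ** 2 for c in a]))

width_x, width_k = packet_widths()
print(f"delta_x = {width_x:.4f}, delta_k = {width_k:.4f}, product = {width_x * width_k:.4f}")
```

The product of the two numerically found widths is close to 1/2, i.e. $\delta x\,\delta p$ is close to $\hbar/2$, in agreement with Eq. (24).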

Indeed, we already know that the time evolution of the wavefunction is given by Eq. (1.67), for
our case giving 6

$$\Psi(x,t) = \left(\frac{1}{2\pi}\right)^{1/2}\frac{1}{(2\pi)^{1/4}(\delta k)^{1/2}}\int\exp\left\{-\frac{(k-k_0)^2}{(2\delta k)^2}\right\}e^{ikx}\exp\left\{-i\frac{\hbar k^2}{2m}t\right\}dk. \tag{2.26}$$

Fig. 1 shows several snapshots of the real part of wavefunction (26), for a particular case $\delta k = 0.1k_0$.


The plots clearly show the following effects: 

(i) the wave packet as a whole (as characterized by its envelope) moves along the x axis with a
certain group velocity $v_{gr}$,


6 Note that this packet is equivalent to Eq. (16) and hence is properly normalized to 1 - see Eq. (3). Hence the
wave packet introduction offers a natural solution to the problem of infinite wave normalization, which was
mentioned in Sec. 1.2.
(ii) the “carrier” wave inside the packet moves with a different, phase velocity $v_{ph}$, which may be
defined as the velocity of the spatial points where the wave's phase $\varphi(x,t) \equiv \arg\Psi$ takes a certain fixed value
(say, $\varphi = \pi/2$, where Re $\Psi$ vanishes), and

(iii) the packet’s spatial width gradually increases with time - the packet spreads. 

All these effects are common for waves of any physical nature. 7 Indeed, let us consider a 1D
wave packet of the type (26),

$$\Psi(x,t) = \int a_ke^{i(kx-\omega t)}\,dk, \tag{2.27}$$

propagating in a medium with an arbitrary (but smooth!) dispersion relation $\omega(k)$, and assume that the
wave number distribution $a_k$ is arbitrary but narrow: $\delta k \ll \langle k\rangle \equiv k_0$ - see Fig. 1.6b. Then we may expand
function $\omega(k)$ into the Taylor series near the central point $k_0$, and keep only three of its leading terms:

$$\omega(k) \approx \omega_0 + \frac{d\omega}{dk}\tilde{k} + \frac{1}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2, \quad \text{where } \tilde{k} \equiv k - k_0,\ \omega_0 \equiv \omega(k_0), \tag{2.28}$$


and both derivatives are also evaluated at point $k = k_0$. In this approximation, 8 the expression in
parentheses in the right-hand part of Eq. (27) may be rewritten as

$$kx - \omega t \approx k_0x + \tilde{k}x - \left(\omega_0 + \frac{d\omega}{dk}\tilde{k} + \frac{1}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2\right)t = (k_0x - \omega_0t) + \tilde{k}\left(x - \frac{d\omega}{dk}t\right) - \frac{1}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2t, \tag{2.29}$$


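so that Eq. (27) is reduced to integral

$$\Psi(x,t) \approx e^{i(k_0x-\omega_0t)}\int a_k\exp\left\{i\left[\tilde{k}\left(x - \frac{d\omega}{dk}t\right) - \frac{1}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2t\right]\right\}d\tilde{k}. \tag{2.30}$$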


First, let us neglect the last term in square brackets (which is much smaller than the first term if the
dispersion relation is smooth enough and/or the time interval t is sufficiently small), and compare the
result with the initial form of the wave packet (27):

$$\Psi(x,0) = \int a_ke^{ikx}\,dk = A(x)e^{ik_0x}, \quad \text{with } A(x) \equiv \int a_ke^{i\tilde{k}x}\,dk. \tag{2.31}$$

The comparison shows that Eq. (30) is reduced to

$$\Psi(x,t) = A(x - v_{gr}t)\,e^{ik_0(x - v_{ph}t)}, \tag{2.32}$$

where $v_{gr}$ and $v_{ph}$ are two constants with the dimension of velocity:

$$v_{gr} \equiv \frac{d\omega}{dk}\bigg|_{k=k_0}, \quad \text{and } v_{ph} \equiv \frac{\omega}{k}\bigg|_{k=k_0}. \tag{2.33}$$

It is clear that Eq. (32) describes effects (i) and (ii) listed above. Let us calculate the group and
phase velocities for the particular case of de Broglie waves whose dispersion law is given by Eq. (1.30):

7 See, e.g., brief discussions in CM Sec. 5.3 and EM Sec. 7.2. 

8 By the way, in the particular case of de Broglie wave described by dispersion relation (1.30), Eq. (28) is exact,
because $\omega = E/\hbar$ is a quadratic function of $k = p/\hbar$, and all higher derivatives of $\omega$ over k vanish for any $k_0$.

$$\omega = \frac{\hbar k^2}{2m}, \quad v_{gr} \equiv \frac{d\omega}{dk}\bigg|_{k=k_0} = \frac{\hbar k_0}{m} \equiv v_0, \quad v_{ph} \equiv \frac{\omega}{k}\bigg|_{k=k_0} = \frac{\hbar k_0}{2m} = \frac{v_{gr}}{2}. \tag{2.34}$$


We see that (very fortunately!) the velocity of the wave packet envelope is constant and equal to that of
the classical particle moving by inertia, in accordance with the correspondence principle.
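
As a quick numerical illustration of the definitions (33), the sketch below evaluates the two velocities for an electron, differentiating the de Broglie dispersion law numerically. The wave number $k_0 = 10^{10}$ m$^{-1}$ is an assumed illustrative value, and the function names are hypothetical; the physical constants are CODATA values.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg

def omega(k, mass=M_E):
    """De Broglie dispersion law: omega = hbar*k**2 / (2*m)."""
    return HBAR * k * k / (2 * mass)

def group_velocity(k0, rel_step=1e-6):
    """d(omega)/dk at k = k0, by a central finite difference."""
    h = k0 * rel_step
    return (omega(k0 + h) - omega(k0 - h)) / (2 * h)

def phase_velocity(k0):
    """omega/k at k = k0."""
    return omega(k0) / k0

k0 = 1e10  # assumed wave number, 1/m
v_gr, v_ph = group_velocity(k0), phase_velocity(k0)
print(f"v_gr = {v_gr:.4e} m/s, v_ph = {v_ph:.4e} m/s, ratio = {v_ph / v_gr:.4f}")
```

The group velocity coincides with the classical value $\hbar k_0/m$, while the phase velocity is half of it.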

The remaining term in the square brackets of Eq. (30) describes effect (iii), the wave packet's
spread. It may be readily evaluated if the packet (27) is initially Gaussian, as in our example (25):

$$a_k = \text{const}\times\exp\left\{-\frac{\tilde{k}^2}{(2\delta k)^2}\right\}. \tag{2.34}$$


In this case integral (30) is Gaussian, and may be worked out exactly as integral (21), i.e. merging the
exponents under the integral, and presenting them as a full square of a linear combination of x and $\tilde{k}$:

$$-\frac{\tilde{k}^2}{(2\delta k)^2} + i\tilde{k}(x - v_{gr}t) - \frac{i}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2t = -\Delta(t)\left[\tilde{k} - i\frac{x - v_{gr}t}{2\Delta(t)}\right]^2 - \frac{(x - v_{gr}t)^2}{4\Delta(t)}, \tag{2.35}$$


where I have introduced the following complex function of time:

$$\Delta(t) \equiv \frac{1}{4(\delta k)^2} + \frac{i}{2}\frac{d^2\omega}{dk^2}t = (\delta x)^2 + \frac{i}{2}\frac{d^2\omega}{dk^2}t, \tag{2.36}$$

and have used Eq. (24) in the second equality. Now integrating over $\tilde{k}$, we get

$$\Psi(x,t) \propto \exp\left\{-\frac{(x - v_{gr}t)^2}{4\Delta(t)} + i(k_0x - \omega_0t)\right\}. \tag{2.37}$$


The imaginary part of ratio $1/\Delta(t)$ in the exponent gives just an additional contribution to the wave's phase,
and does not affect the resulting probability distribution

$$w(x,t) \equiv \Psi^*\Psi \propto \exp\left\{-\frac{(x - v_{gr}t)^2}{2}\,\mathrm{Re}\frac{1}{\Delta(t)}\right\}. \tag{2.38}$$

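This is again a Gaussian bell curve spread over axis x, centered at point $\langle x\rangle = v_{gr}t$, with the r.m.s. width

$$(\delta x')^2 = \left\{\mathrm{Re}\left[\frac{1}{\Delta(t)}\right]\right\}^{-1} = (\delta x)^2 + \left(\frac{1}{2}\frac{d^2\omega}{dk^2}t\right)^2\frac{1}{(\delta x)^2}. \tag{2.39a}$$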


In the particular case of de Broglie waves, $d^2\omega/dk^2 = \hbar/m$, so that

$$(\delta x')^2 = (\delta x)^2 + \left(\frac{\hbar t}{2m}\right)^2\frac{1}{(\delta x)^2}. \tag{2.39b}$$
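
The spreading law (39b) may be cross-checked by evaluating the superposition (26) numerically and measuring the width of the resulting probability density. The sketch below is an illustration under stated assumptions: units with $\hbar = m = 1$, initial width $\delta x = 1$, $k_0 = 2$, and rectangle-rule quadrature over a truncated k range. The fixed x window is adequate only for the short times used here. All names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1
MASS = 1.0  # assumption: units with m = 1

def numeric_width(t, dx0=1.0, k0=2.0, n=301):
    """R.m.s. width of |Psi(x,t)|**2, with Psi built as the k-integral of Eq. (2.26)."""
    dk = 1.0 / (2.0 * dx0)
    ks = [k0 - 8 * dk + 16 * dk * i / (n - 1) for i in range(n)]
    amps = [math.exp(-((k - k0) / (2 * dk)) ** 2) for k in ks]  # unnormalized a_k
    center = HBAR * k0 * t / MASS                               # expected packet center
    xs = [center - 20 * dx0 + 40 * dx0 * i / (n - 1) for i in range(n)]
    density = []
    for x in xs:
        psi = sum(a * cmath.exp(1j * (k * x - HBAR * k * k * t / (2 * MASS)))
                  for k, a in zip(ks, amps))
        density.append(abs(psi) ** 2)
    total = sum(density)
    mean = sum(x * w for x, w in zip(xs, density)) / total
    variance = sum((x - mean) ** 2 * w for x, w in zip(xs, density)) / total
    return math.sqrt(variance)

def formula_width(t, dx0=1.0):
    """Width predicted by Eq. (2.39b)."""
    return math.sqrt(dx0 ** 2 + (HBAR * t / (2 * MASS * dx0)) ** 2)

results = [(t, numeric_width(t), formula_width(t)) for t in (0.0, 2.0, 4.0)]
for t, numeric, predicted in results:
    print(f"t = {t:.1f}: numeric width = {numeric:.4f}, Eq. (2.39b) gives {predicted:.4f}")
```

Within the quadrature accuracy, the numerically found widths follow Eq. (39b).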


The physics of the spreading is very simple: if $d^2\omega/dk^2 \neq 0$, the group velocity $d\omega/dk$ of each
small group dk of monochromatic components of the wave packet is different, resulting in the gradual
(eventually, linear) accumulation of the differences of the distances traveled by the groups. The most
curious feature of Eq. (39) is that the packet width at t > 0 depends on its initial width $\delta x'(0) = \delta x$ in a
non-monotonic way, tending to infinity at both $\delta x \to 0$ and $\delta x \to \infty$. Because of that, for a fixed t, there
is an optimal value of $\delta x$ which minimizes $\delta x'$:

$$(\delta x')_{min} = \sqrt{2}\,(\delta x)_{opt} = \left(\frac{\hbar t}{m}\right)^{1/2}. \tag{2.40}$$


This expression may be used for spreading effect estimates. Due to the smallness of the Planck constant
$\hbar$ on the human scale of things, for macroscopic bodies this effect is extremely small even for very long
time intervals; however, for light particles it may be very noticeable: for the electron ($m = m_e \approx 10^{-30}$
kg), and t = 1 s, Eq. (40) yields $(\delta x')_{min} \sim 1$ cm!
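
The electron estimate is easy to reproduce. The sketch below evaluates Eq. (40) with CODATA constants and also checks that the optimal initial width, which minimization of Eq. (39b) over $\delta x$ gives as $(\delta x)_{opt} = (\hbar t/2m)^{1/2}$, indeed yields a smaller final width than neighboring initial widths. The function names are illustrative.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg

def spread_width(dx0, t, mass):
    """Packet width at time t from Eq. (2.39b), for initial width dx0."""
    return math.sqrt(dx0 ** 2 + (HBAR * t / (2 * mass * dx0)) ** 2)

def optimal_initial_width(t, mass):
    """Initial width minimizing Eq. (2.39b) at a fixed time t."""
    return math.sqrt(HBAR * t / (2 * mass))

def minimal_width(t, mass):
    """Minimal final width, Eq. (2.40)."""
    return math.sqrt(HBAR * t / mass)

t = 1.0  # seconds
dx_opt = optimal_initial_width(t, M_E)
print(f"(dx)_opt   = {dx_opt * 100:.2f} cm")
print(f"(dx')_min  = {minimal_width(t, M_E) * 100:.2f} cm")
for factor in (0.5, 1.0, 2.0):
    print(f"dx0 = {factor:.1f} (dx)_opt -> dx' = {spread_width(factor * dx_opt, t, M_E) * 100:.2f} cm")
```

For comparison, calling `minimal_width` with a mass of one gram and a time of one year gives a width of the order of $10^{-12}$ m, illustrating why the effect is negligible for macroscopic bodies.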

Note also that for any $t \neq 0$, the wave packet retains its Gaussian envelope, but the ultimate
relation (24) is not satisfied, $\delta x'\,\delta p > \hbar/2$ - due to a gradually accumulated phase shift between the
component monochromatic waves. The last remark on this topic: in quantum mechanics, the wave
packet spreading is not a ubiquitous effect! For example, in Chapter 5 we will see that in a quantum
oscillator, the spatial width of a Gaussian packet (for that system, called the Glauber state) does not
grow monotonically but rather either stays constant or oscillates in time.

Now let us briefly discuss the case when the initial wave packet is not Gaussian, but is described
by an arbitrary initial wavefunction. In order to make the forthcoming result more appealing, it is
beneficial to generalize our calculations to an arbitrary initial time $t_0$; it is evident that if U does not
depend on time explicitly, it is sufficient to replace t with $(t - t_0)$ in all above formulas. With this
replacement, Eq. (27) becomes

$$\Psi(x,t) = \int a_ke^{i[kx - \omega(t - t_0)]}\,dk, \tag{2.41}$$

and the reciprocal transform (21) reads

$$a_k = \frac{1}{2\pi}\int\Psi(x,t_0)e^{-ikx}\,dx. \tag{2.42}$$

If we want to express these two formulas with one relation, i.e. plug Eq. (42) into Eq. (41), we
should give the integration variable x some other name, e.g., $x_0$. The result is

$$\Psi(x,t) = \frac{1}{2\pi}\int dk\int dx_0\,\Psi(x_0,t_0)\,e^{i[k(x - x_0) - \omega(t - t_0)]}. \tag{2.43}$$


Changing the order of integration, this expression may be rewritten in the following general form:

$$\Psi(x,t) = \int G(x,t;x_0,t_0)\,\Psi(x_0,t_0)\,dx_0, \tag{2.44}$$

where function G, usually called the kernel in mathematics, in quantum mechanics is called the
propagator. 9 According to Eq. (43), in our particular case of a free particle the propagator is equal to


9 Its standard notation by letter G stems from the fact that the propagator is essentially the spatial-temporal 
Green’s function of Eq. (2.18), defined very similarly to Green’s functions of other ordinary and partial 
differential equations describing various physics systems - see, e.g., CM Sec. 4.1 and/or EM Sec. 2.7 and 7.3. 


$$G(x,t;x_0,t_0) = \frac{1}{2\pi}\int e^{i[k(x - x_0) - \omega(t - t_0)]}\,dk. \tag{2.45}$$

The physical sense of the propagator may be understood by considering the following special
initial conditions: 10

$$\Psi(x_0,t_0) = \delta(x_0 - x'), \tag{2.46}$$

where x' is a certain point within the domain of the particle's motion. In this particular case, Eq. (44)
evidently gives

$$\Psi(x,t) = G(x,t;x',t_0). \tag{2.47}$$


Hence, the propagator, considered as a function of x and t only, is just the solution of the linear
differential equation with $\delta$-functional initial conditions. Thus while Eq. (41) may be understood as a
mathematical expression of the linear superposition principle in the momentum (i.e., reciprocal) space
domain, Eq. (44) is an expression of this principle in the direct space domain: the system's “response”
$\Psi(x,t)$ to an arbitrary initial condition $\Psi(x_0,t_0)$ is just a sum of its responses to its thin spatial “slices”,
with propagator $G(x,t;x_0,t_0)$ representing the weight of each slice in the final sum.


Calculating integral (45), one should remember that $\omega$ is not a constant but a function of k, given
by the dispersion relation for particular waves. In particular, for the de Broglie waves

$$G(x,t;x_0,t_0) = \frac{1}{2\pi}\int\exp\left\{i\left[k(x - x_0) - \frac{\hbar k^2}{2m}(t - t_0)\right]\right\}dk. \tag{2.48}$$

This is a Gaussian integral again, and may be readily calculated just as it was done (twice) above, by
completing the exponent to the full square. The result is

$$G(x,t;x_0,t_0) = \left(\frac{m}{2\pi i\hbar(t - t_0)}\right)^{1/2}\exp\left\{-\frac{m(x - x_0)^2}{2i\hbar(t - t_0)}\right\}. \tag{2.49}$$


Please note the following features of this complex function (plotted in Fig. 2): 



Fig. 2.2. Real (solid line) and imaginary (dashed line) parts of the 1D free particle's propagator.


10 Note that this initial condition is not equivalent to a $\delta$-functional initial probability density (2).
(i) It depends only on differences $(x - x_0)$ and $(t - t_0)$. This is natural, because the free-particle
propagation problem is uniform (translation-invariant) both in space and time.

(ii) The function shape does not depend on its arguments - they just rescale the same function:
its snapshot (Fig. 2), if plotted as a function of un-normalized x, just becomes broader and lower with
time. It is curious that the spatial broadening scales as $(t - t_0)^{1/2}$ - just as at the classical diffusion, as a
result of a deep analogy between quantum mechanics and classical statistics - to be discussed further in
Chapter 7.

(iii) In accordance with the uncertainty relation, the ultimately compressed wave packet (46) has 
an infinite width of momentum distribution, and the quasi-sinusoidal tails of the free-particle 
propagator, clearly visible in Fig. 2, are the results of the free propagation of the fastest (highest- 
momentum) components of that distribution, in both directions from the packet center. In the following 
sections, we will mostly focus on the spatial distribution of stationary, monochromatic wavefunctions 
(that, for unconfined motion, may be interpreted as wave packets of very large spatial width $\delta x$), only
rarely coming back to the wave packet discussion. Our excuse is the linear superposition principle, i.e. 
our conceptual ability to restore the general solution from that of monochromatic waves of all possible 
energies. However, the reader should not forget that, as the above discussion has illustrated, 
mathematically this restoration is not always trivial. 
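
One simple sanity check of the free-particle propagator (49) is to verify that, as a function of x and t, it satisfies the free 1D Schrödinger equation. The sketch below does this with finite differences; it assumes units with $\hbar = m = 1$, the default values $x_0 = t_0 = 0$, and arbitrarily chosen test points. All names are hypothetical.

```python
import cmath

HBAR = 1.0  # assumption: units with hbar = 1
MASS = 1.0  # assumption: units with m = 1

def propagator(x, t, x0=0.0, t0=0.0):
    """Free-particle propagator of Eq. (2.49), for t > t0."""
    tau = t - t0
    prefactor = cmath.sqrt(MASS / (2j * cmath.pi * HBAR * tau))
    return prefactor * cmath.exp(-MASS * (x - x0) ** 2 / (2j * HBAR * tau))

def schrodinger_residual(x, t, h=1e-3):
    """i*hbar*dG/dt + (hbar**2/2m)*d2G/dx2, which should vanish for U = 0."""
    dg_dt = (propagator(x, t + h) - propagator(x, t - h)) / (2 * h)
    d2g_dx2 = (propagator(x + h, t) - 2 * propagator(x, t) + propagator(x - h, t)) / h ** 2
    return 1j * HBAR * dg_dt + HBAR ** 2 / (2 * MASS) * d2g_dx2

for x, t in ((0.5, 1.0), (1.5, 1.0), (-2.0, 0.7)):
    print(f"x = {x:+.1f}, t = {t:.1f}: |G| = {abs(propagator(x, t)):.4f}, "
          f"|residual| = {abs(schrodinger_residual(x, t)):.1e}")
```

The residual is limited only by the finite-difference accuracy, while the modulus of G does not depend on x and decreases as $(t - t_0)^{-1/2}$, in agreement with feature (ii) above.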


2.3. Particle motion in simple potential profiles 

Now, let us proceed to the cases in which the potential energy U(x,t) is not identically equal to
zero. The easiest case is that of a spatially-uniform but time-dependent potential: $U = U(t) \neq$ const.
Indeed, the corresponding Schrödinger equation (1.25) with Hamiltonian

$$\hat{H} = \frac{\hat{p}^2}{2m} + U(t) = -\frac{\hbar^2}{2m}\nabla^2 + U(t), \tag{2.50}$$

allows the variable separation similar to that performed in Sec. 1.5, except that the time-dependent
function T(t) obeys an equation of motion that is slightly more general than Eq. (1.59):

$$i\hbar\dot{T} = [E + U(t)]\,T, \tag{2.51}$$

whose solution may be expressed as an evident generalization of Eq. (1.61):

$$T(t) = T(0)\,e^{-i\omega_0t + i\varphi(t)}, \quad \text{with } \omega_0 \equiv \frac{E}{\hbar} \text{ and } \frac{d\varphi}{dt} \equiv -\frac{U(t)}{\hbar}. \tag{2.52}$$

Looking at the basic relations (1.22) and (1.23) of wave mechanics, it seems that this additional
phase factor does not affect the particle probability distribution, or even any observable (including
the energy, if it is referred to the instant value of U), and hence the phase increment $\varphi$, associated with U(t), is
just a mathematical artifact. This is certainly true for a single particle; however, the situation changes as
soon as we recall that the Universe consists of more than one of them.

For example, consider two similar, independent particles, each in the same (say, ground)
eigenstate, but with the potential energies (and hence eigenenergies $E_{1,2}$) different by a constant $\Delta U \equiv U_1 - U_2$.
Then, the difference $\varphi \equiv \varphi_1 - \varphi_2$ of their wavefunction phases evolves in time as

$$\frac{d\varphi}{dt} = -\frac{\Delta U}{\hbar}. \tag{2.53}$$



Essential Graduate Physics 


QM: Quantum Mechanics 


If the particles are in different worlds (or at least in different laboratories), this evolution is
unobservable; however, it should be intuitively clear that a very weak coupling of a certain detector to
each particle may allow it to observe phase $\varphi$, while keeping the particle dynamics virtually
unperturbed, i.e. Eq. (53) intact.

Perhaps the most dramatic demonstration of this phenomenon is the Josephson effect in
superconductors. 11 Experimentally, the easiest way to observe the effect is to connect two bulk
superconductor samples with a weak, short electric contact (called either the weak link or the Josephson
junction) and bias them with a constant (dc) voltage V, typically in a few-microvolt range - see Fig. 3.



Fig. 2.3. Josephson effect in a weak link between two bulk superconductor electrodes.


Superconductivity may be explained by a specific coupling between conduction electrons,
that leads, at low temperatures, to the formation of the so-called Cooper pairs. Such pairs, each consisting
of two electrons with opposite spins and momenta, behave as Bose particles, and form a coherent
Bose-Einstein condensate. 12 Most properties of such a condensate may be described by a single wavefunction,
evolving in time as that of a free particle with the effective potential energy $U = q\phi = -2e\phi$, where $\phi$ is
the electrochemical potential, 13 and q = -2e is the total charge of the Cooper pair. As a result, for the
situation shown in Fig. 3, Eq. (53) takes the form

$$\frac{d\varphi}{dt} = \frac{2e}{\hbar}V, \tag{2.54}$$

where $V = \phi_1 - \phi_2$ is the applied voltage. B. Josephson has predicted that, in a particular case when a
weak link is a tunnel junction, the electric current I of Cooper pairs through it should have a simple form: 14

$$I = I_c \sin\varphi, \tag{2.55}$$


11 It was predicted theoretically by B. Josephson (then a graduate student!) in 1962 and observed experimentally 
in less than a year. More recently, analogs of this effect were also observed in superfluid helium and atomic Bose- 
Einstein condensates. 

12 See, e.g., SM Sec. 3.4. 

13 For more on this notion see, e.g. SM Sec. 6.4. 

14 Later, Eq. (55) has been shown to be valid for other weak link types as well, though deviations from it have also
been found. These deviations, however, do not affect the fundamental $2\pi$-periodicity of function $I(\varphi)$ - see, e.g.,
EM Sec. 6.4. As a result, no deviations from the fundamental relations (56)-(57) have been found (yet :-).




where $I_c$ is some constant (scaling as the weak link strength). Combining Eqs. (54) and (55), we see that
if the applied voltage is constant in time, the current oscillates with the so-called Josephson frequency

$$f_J = \frac{\omega_J}{2\pi}, \quad \text{where } \omega_J = \frac{2e}{\hbar}V, \tag{2.56}$$

as high as ~484 MHz per each microvolt of applied dc voltage. This effect is now well documented,
though a direct detection of the Josephson radiation is tricky; it is much easier to observe the phase
locking (synchronization) 15 of the radiation by an external microwave signal, which results in the formation of
nearly flat dc current steps at dc voltages

$$V_n = n\frac{\hbar\omega}{2e}, \tag{2.57}$$

where $\omega$ is the external signal frequency and n is an integer. 16 This effect is now being used in highly
accurate standards of dc voltage. 17
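
The numbers quoted above are easy to reproduce. The following minimal Python sketch evaluates Eqs. (56) and (57); the function names and the 10 GHz signal frequency are illustrative assumptions, not taken from the course, and the constants are the exact SI values of e and h.

```python
# A sketch under stated assumptions: the helper names are hypothetical.
E_CHARGE = 1.602176634e-19   # elementary charge e, in C (exact SI value)
H_PLANCK = 6.62607015e-34    # Planck constant h = 2*pi*hbar, in J*s (exact SI value)

def josephson_frequency(voltage):
    """Josephson frequency f_J = omega_J / 2 pi = 2 e V / h of Eq. (2.56), in Hz."""
    return 2.0 * E_CHARGE * voltage / H_PLANCK

def step_voltage(n, signal_frequency):
    """Voltage of the n-th dc current step, V_n = n h f / 2e, Eq. (2.57), in V."""
    return n * H_PLANCK * signal_frequency / (2.0 * E_CHARGE)

f_per_microvolt = josephson_frequency(1e-6)
print(f"f_J at V = 1 uV: {f_per_microvolt / 1e6:.1f} MHz")

first_step = step_voltage(1, 10e9)   # assumed 10 GHz external signal
print(f"first step at 10 GHz: {first_step * 1e6:.2f} uV")
```

Note that the two functions are inverse to each other for n = 1: a junction biased exactly at the first step voltage oscillates at the external signal frequency, which is the essence of the phase locking.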

Now, let us move on to a discussion of the opposite case, when a 1D particle moves in various
potential profiles U(x) that are constant in time. Conceptually, the simplest of such profiles is a potential
step - see Fig. 4.




As I am sure the reader knows, in classical mechanics, if a particle is incident on such a step (in
Fig. 4, from the left), its kinetic energy $p^2/2m$ cannot be negative, so that it can only travel through the
classically accessible region where its (conserved) full energy,

$$E = \frac{p^2}{2m} + U(x), \tag{2.58}$$

is larger than the local value U(x). Let the initial velocity v = p/m be positive, i.e. directed toward the
step. Before it has reached the classical turning point $x_c$, defined by equation

$$U(x_c) = E, \tag{2.59}$$


15 See, e.g., CM Sec. 4.4. 

16 If $\omega$ is not too high, this effect may be adequately described by combining Eqs. (54)-(55). Let me leave this task
for the reader.

17 The most precise proof that the Josephson frequency-to-voltage ratio $f_J/V$ does not depend on the superconducting
material (to at least 15 decimal places!) has been carried out by the group led by J. Lukens here at Stony Brook -
see J.-S. Tsai et al., Phys. Rev. Lett. 51, 316 (1983).

the kinetic energy $p^2/2m$ never turns to zero, so that the particle continues to move in the initial direction.
On the other hand, the particle cannot penetrate the classically forbidden region $x > x_c$, because there
its kinetic energy would be negative. At the point $x = x_c$, the particle's velocity changes sign, i.e. it is
reflected back from the classical turning point.

In order to see what the wave mechanics says about this situation, let us start from the simplest, 
sharp potential step shown with bold black lines in Fig. 5: 


$$U(x) = U_0\theta(x) \equiv \begin{cases} 0, & \text{at } x < 0, \\ U_0, & \text{at } 0 < x. \end{cases} \tag{2.60}$$

For this choice, and any energy within the interval $0 < E < U_0$, the classical turning point is $x_c = 0$.



Fig. 2.5. Reflection of a monochromatic wave from a potential step $U_0 > E$. (This particular
wavefunction's shape is for $U_0 = 5E$.) The wavefunction is plotted with the same schematic
vertical offset by E as those in Fig. 1.7.


Let us represent an incident particle with a wave packet so long that the spread $\delta k \sim 1/\delta x$ of its
wave number spectrum, and hence the energy uncertainty $\delta E = \hbar\delta\omega = \hbar(d\omega/dk)\delta k$, is negligible in
comparison with its average value $E < U_0$, as well as with $(U_0 - E)$. In this case, E may be considered a
given constant, the time dependence of the solution is given by Eq. (1.61), and we can limit
ourselves to the solution of the 1D version of the stationary Schrödinger equation (1.63), in this case

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U(x)\psi = E\psi, \tag{2.61}$$

for the spatial part $\psi(x)$ of the wavefunction. 18


At x < 0, i.e. at U = 0, the equation is reduced to the Helmholtz equation (1.75), and may be
satisfied with two traveling waves, proportional to $\exp\{+ikx\}$ and $\exp\{-ikx\}$, respectively, with k
satisfying the dispersion equation (1.30):

$$k^2 = \frac{2mE}{\hbar^2}. \tag{2.62}$$

Thus the general solution of Eq. (61) in this region may be presented as


18 Note that this is not the eigenproblem like the one we have solved in Sec. 1.4 for a quantum well. Indeed, now 
energy E is considered fixed - e.g., by the initial conditions that launch a long wave packet upon the potential 
step, from the left. 

$$\psi_-(x) = Ae^{+ikx} + Be^{-ikx}. \tag{2.63}$$

The second term in the right-hand part evidently describes an (infinitely long) wave packet traveling to
the left, which represents the particle's reflection from the potential step. If B = -A, this solution is reduced
to Eq. (1.76) for the potential well with infinitely high walls, but as we will see in a minute, for our
current case of a finite step height $U_0$, the relation between coefficients B and A may be different.

To show this, let us solve Eq. (61) for x > 0, where $U = U_0 > E$. In this region the equation may
be rewritten as

$$\frac{d^2\psi_+}{dx^2} = \kappa^2\psi_+, \tag{2.64}$$

where $\kappa$ is a real constant defined by a relation similar to Eq. (62):

$$\kappa^2 \equiv \frac{2m(U_0 - E)}{\hbar^2} > 0. \tag{2.65}$$

The general solution of Eq. (64) is the sum of $\exp\{+\kappa x\}$ and $\exp\{-\kappa x\}$, with arbitrary coefficients.
However, the wavefunction should be finite at $x \to +\infty$, so only the latter exponent is acceptable:

$$\psi_+(x) = Ce^{-\kappa x}. \tag{2.66}$$

This penetration of the wavefunction into the classically forbidden region, and hence a finite
probability to find the particle there, is one of the most fascinating predictions of quantum mechanics,
and has been repeatedly observed in experiment, e.g., via tunneling experiments - see below. From Eq.
(66), it is evident that the constant $\kappa$, defined by Eq. (65), may be interpreted as the reciprocal
penetration depth. Even for the lightest particles this depth is usually very small. Indeed, for $E \ll U_0$
that equation yields

$$\delta \equiv \frac{1}{\kappa}\bigg|_{E \to 0} = \frac{\hbar}{(2mU_0)^{1/2}}. \tag{2.67}$$

For example, for a conduction electron in a typical metal, that runs, at its surface, into a sharp potential
step $U_0$ whose height equals the metal's workfunction $W \approx 5$ eV (see the discussion of the photoelectric
effect in Sec. 1.1), $\delta$ is close to 0.1 nm, i.e. is close to a typical size of an atom. For heavier elementary
particles (e.g., protons) the penetration depth is correspondingly lower, and for macroscopic bodies it is
hardly measurable.
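
The estimate above may be checked with a few lines of Python. This is only a sketch of Eq. (67): the function name is hypothetical, the constants are standard CODATA values, and the same 5 eV step height is assumed for the proton purely for comparison.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant, J*s
M_ELECTRON = 9.1093837015e-31 # electron mass, kg
M_PROTON = 1.67262192369e-27  # proton mass, kg
EV = 1.602176634e-19          # one electron-volt, J

def penetration_depth(mass, step_height):
    """Penetration depth delta = hbar / (2 m U0)^(1/2) of Eq. (2.67), in meters."""
    return HBAR / math.sqrt(2.0 * mass * step_height)

u0 = 5.0 * EV                 # step height equal to a typical workfunction
delta_electron = penetration_depth(M_ELECTRON, u0)
delta_proton = penetration_depth(M_PROTON, u0)   # same U0, assumed for comparison only

print(f"electron: {delta_electron * 1e9:.3f} nm")
print(f"proton:   {delta_proton * 1e9:.5f} nm")
```

Since $\delta$ scales as $m^{-1/2}$, the proton's depth is smaller than the electron's by the factor $(m_p/m_e)^{1/2}$, i.e. by more than an order of magnitude.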

Returning to our problem, we still should find coefficients A, B, and C from the boundary
conditions at x = 0. Since E is a finite constant, and U(x) is a finite function, Eq. (61) says that $d^2\psi/dx^2$
should be finite as well. This means that the first derivative should be continuous:

$$\lim_{\varepsilon\to 0}\left(\frac{d\psi}{dx}\bigg|_{x=+\varepsilon} - \frac{d\psi}{dx}\bigg|_{x=-\varepsilon}\right) = \lim_{\varepsilon\to 0}\int_{-\varepsilon}^{+\varepsilon}\frac{d^2\psi}{dx^2}\,dx = \lim_{\varepsilon\to 0}\frac{2m}{\hbar^2}\int_{-\varepsilon}^{+\varepsilon}[U(x) - E]\,\psi\,dx = 0. \tag{2.68}$$

Repeating such a calculation for function $\psi(x)$ itself, we see that it also should be continuous at all points,
including x = 0, so that

$$\psi_-(0) = \psi_+(0), \qquad \frac{d\psi_-}{dx}(0) = \frac{d\psi_+}{dx}(0). \tag{2.69}$$


Plugging solutions (63) and (66) into these two boundary conditions, we get a system of two linear
equations

$$A + B = C, \qquad ikA - ikB = -\kappa C, \tag{2.70}$$

whose (elementary) solution enables us to express B and C via A:

$$B = A\frac{k - i\kappa}{k + i\kappa}, \qquad C = A\frac{2k}{k + i\kappa}. \tag{2.71}$$


We immediately see that since the numerator and denominator in the first of these formulas have
equal moduli, |B| = |A|. This means that, as we could expect, a particle with energy $E < U_0$ is
totally reflected from the step. As a result, at x < 0 our solution (63) may be presented by a standing
wave

$$\psi_- = 2iAe^{i\theta}\sin(kx - \theta), \quad \text{with } \theta \equiv \arctan\frac{k}{\kappa}. \tag{2.72}$$

Notice that the shift $\Delta x \equiv \theta/k = (\arctan k/\kappa)/k$ of the standing wave to the right, due to the partial
penetration of the wavefunction under the potential step, is commensurate with, but generally not equal
to, $\delta = 1/\kappa$. Figure 5 shows the full behavior of the wavefunction, for a particular case $E = U_0/5$, at which
$k/\kappa \equiv [E/(U_0 - E)]^{1/2} = 1/2$.
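
These relations are easy to verify numerically. The sketch below evaluates the amplitudes of Eq. (71) for the case $U_0 = 5E$ of Fig. 5 and compares the sum of the incident and reflected waves with a standing wave shifted by $\theta = \arctan(k/\kappa)$. The units ($\hbar^2/2m = 1$, so that $k^2 = E$) and all function names are assumptions made here for illustration only.

```python
import cmath
import math

def step_amplitudes(energy, u0, a=1.0):
    """Return (k, kappa, B, C) from Eq. (2.71) for 0 < E < U0.

    Assumed units: hbar**2 / 2m = 1, so k**2 = E and kappa**2 = U0 - E.
    """
    k = math.sqrt(energy)
    kappa = math.sqrt(u0 - energy)
    b = a * (k - 1j * kappa) / (k + 1j * kappa)
    c = a * 2.0 * k / (k + 1j * kappa)
    return k, kappa, b, c

a = 1.0
k, kappa, b, c = step_amplitudes(energy=1.0, u0=5.0, a=a)
theta = math.atan(k / kappa)

def psi_left(x):
    """Incident plus reflected wave, Eq. (2.63)."""
    return a * cmath.exp(1j * k * x) + b * cmath.exp(-1j * k * x)

def psi_standing(x):
    """Standing wave shifted by theta, with the overall phase factor exp(i*theta)."""
    return 2j * a * cmath.exp(1j * theta) * math.sin(k * x - theta)

print("|B|/|A| =", abs(b) / abs(a))
print("k/kappa =", k / kappa)
print("max mismatch:", max(abs(psi_left(x) - psi_standing(x)) for x in (-3.0, -0.7, 0.0)))
```

The overall phase factor multiplying the sine does not affect the probability density; only the shift $\theta$ of the nodes is observable.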

According to Eq. (65), as the particle's energy E is increased to approach $U_0$, the penetration
depth $1/\kappa$ diverges. This raises an important issue: what happens at $E > U_0$, i.e. if there is no classically
forbidden region in the problem? Again, in classical mechanics the incident particle would continue to
move to the right, though with a reduced velocity, corresponding to the new kinetic energy $E - U_0$, so
there would be no reflection. In quantum mechanics, however, the situation is different. In order to
analyze it, it is not necessary to re-solve the whole problem; it is sufficient to note that all our
calculations, and hence Eqs. (71), are still valid if we take 19

$$\kappa = -ik', \quad \text{with } k'^2 \equiv \frac{2m(E - U_0)}{\hbar^2} > 0. \tag{2.73}$$

With this replacement, Eq. (71) becomes 20

$$B = A\frac{k - k'}{k + k'}, \qquad C = A\frac{2k}{k + k'}. \tag{2.74}$$


The most important result of this change is that now the reflection is not complete: |B| < |A|. In
order to evaluate this effect quantitatively, it is fairer to use not the B/A or C/A ratios, but rather that


19 Our earlier discarding of the particular solution $\exp\{+\kappa x\}$, now becoming $\exp\{-ik'x\}$, is still valid, but now on
different grounds: this term would describe a wave packet incident on the potential step from the right, and this is
not the problem under our consideration.

20 These formulas are completely similar to those for the partial reflection of classical waves from a sharp 
interface between two uniform media, at normal incidence (see, e.g., CM Sec. 5.4 and EM Sec. 7.4), with the 
effective impedance Z of de Broglie waves proportional to their wave number k. 

of the probability currents (5) corresponding to the traveling waves with amplitudes C and A, in the
corresponding regions (respectively, x > 0 and x < 0):

$$T \equiv \frac{I_C}{I_A} = \frac{k'|C|^2}{k|A|^2} = \frac{4kk'}{(k + k')^2} = \frac{4[E(E - U_0)]^{1/2}}{[E^{1/2} + (E - U_0)^{1/2}]^2}. \tag{2.75}$$

(T so defined is called the transparency of the inhomogeneity, in our current case of the potential step.)
The result given by Eq. (75) is plotted in Fig. 6a. Notice its most important features:

(i) At $U_0 = 0$, the transparency is full, T = 1 - naturally, for having no step at all.

(ii) At $U_0 \to E$, the transparency tends to zero - giving a proper connection with the case $E < U_0$.

(iii) We can use result (75) even for $U_0 < 0$, i.e. for the step-down (or “cliff”) profile - see Fig.
6b. Very counter-intuitively, the particle is (partly) reflected even from such a cliff, and the transmission
diminishes (rather slowly) at $U_0 \to -\infty$.

Fig. 2.6. (a) Transparency of a potential step with $U_0 < E$ as a function of its height, according to Eq. (75), and (b)
the potential profile at $U_0 < 0$.


The most important conceptual conclusion of our analysis is that the quantum particle is partly 
reflected from a potential step with Uo < E, in the sense that there is a nonvanishing probability T < 1 to 
find it passed over the step, while there is also probability (1 - T) to have it reflected. 
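
The three features listed above may be reproduced with a short Python sketch of Eq. (75). The function names are hypothetical, and the energies are measured in arbitrary common units (the transparency depends only on the ratio $U_0/E$).

```python
import math

def step_transparency(energy, u0):
    """Transparency T = 4 k k' / (k + k')^2 of a sharp potential step, Eq. (2.75).

    For E <= U0 the reflection is total, so the function returns 0.
    """
    if energy <= u0:
        return 0.0
    k = math.sqrt(energy)             # proportional to the wave number at x < 0
    k_prime = math.sqrt(energy - u0)  # proportional to the wave number at x > 0
    return 4.0 * k * k_prime / (k + k_prime) ** 2

def step_reflection(energy, u0):
    """Reflection probability |B/A|^2 from Eq. (2.74), for E > U0."""
    k = math.sqrt(energy)
    k_prime = math.sqrt(energy - u0)
    return ((k - k_prime) / (k + k_prime)) ** 2

energy = 1.0
for u0 in (0.0, 0.5, 0.99, -3.0, -8.0, -99.0):
    print(f"U0/E = {u0:6.2f}   T = {step_transparency(energy, u0):.4f}")
```

For $E > U_0$ the two probabilities add up to one, $T + |B/A|^2 = 1$, as they should for a conserved probability current.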

The same property is exhibited, for any relation between E and Uo, by another simple potential 
profile U(x), the famous tunnel barrier. Figure 7 shows its simple, “rectangular” version: 


$$U(x) = \begin{cases} 0, & \text{for } x < -d/2, \\ U_0, & \text{for } -d/2 < x < +d/2, \\ 0, & \text{for } +d/2 < x. \end{cases} \tag{2.76}$$

Fig. 2.7. Rectangular tunnel barrier.

In order to analyze this problem, it is sufficient to look for the solution to the Schrödinger
equation in the form (63) at x < -d/2. At x > +d/2, i.e., behind the barrier, we may use the arguments
presented above (no wave packet source on the right!) to keep just one traveling wave,

$$\psi_+(x) = Fe^{ikx}. \tag{2.77}$$

However, under the barrier, i.e. at -d/2 < x < +d/2, we should generally keep both exponential terms,

$$\psi_b(x) = Ce^{-\kappa x} + De^{+\kappa x}, \tag{2.78}$$

because our previous argument, used in the potential step problem's solution, is no longer valid. (Here k
and $\kappa$ are still defined, respectively, by Eqs. (62) and (65).) In order to find the relation between
coefficients A, B, C, D, and F, we need to plug these solutions into the boundary conditions similar to
Eqs. (69), but now at two boundary points, $x = \pm d/2$.

Solving the resulting system of four linear equations for five amplitudes (A, B, C, D, and F), we can
readily calculate four ratios B/A, C/A, etc., in particular,

$$\frac{F}{A} = \frac{\exp\{-ikd\}}{\cosh\kappa d + \dfrac{i}{2}\left(\dfrac{\kappa}{k} - \dfrac{k}{\kappa}\right)\sinh\kappa d}, \tag{2.79a}$$

and hence the barrier's transparency

$$T \equiv \left|\frac{F}{A}\right|^2 = \left[\cosh^2\kappa d + \left(\frac{\kappa^2 - k^2}{2\kappa k}\right)^2\sinh^2\kappa d\right]^{-1}. \tag{2.79b}$$

Figure 8a shows the transparency as a function of particle energy E, for several characteristic
values of the barrier thickness d, or rather of the ratio $d/\delta$, where $\delta$ is defined by Eq. (67).

Fig. 2.8. Transparency of the rectangular tunnel barrier as a function of the particle's energy E.

The plots show that for a thin barrier ($d < \delta$) the transparency grows gradually with the particle's
energy. This growth is natural, because the penetration constant $\kappa$ decreases with the growth of E, i.e.,
the wavefunction penetrates more and more into the barrier, so that more and more of it is “picked up”
at the second interface (x = +d/2) and transferred into the wave $F\exp\{ikx\}$ propagating behind the
barrier. As Eq. (79b) shows, for thick barriers ($d \gg \delta$), this dependence is dominated by an exponent,

$$T \approx \left(\frac{4k\kappa}{k^2 + \kappa^2}\right)^2 e^{-2\kappa d}, \tag{2.80}$$

that may be clearly seen as straight segments in semi-log plots (Fig. 8b) of T as a function of the
combination $(1 - E/U_0)^{1/2}$, which is proportional to $\kappa$ - see Eq. (65).


Equation (80) also clearly shows the exponential dependence of the barrier transparency on its
thickness at $d \gg \delta$. This dependence is the most important factor for various applications of
quantum-mechanical tunneling - from the field emission 21 of electrons to scanning tunneling
microscopy. 22 Also notable are the substantial negative implications of the effect for modern electronic
engineering, most importantly a limit on the scaling down of field-effect transistors in
semiconductor integrated circuits (and hence on the circuit density increase according to the well-known
Moore’s law), due to the increase of tunneling both through the gate oxide and along the transistor’s channel. 23
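
To see how fast the exact result (79b) approaches the exponential asymptote (80), one may use the following Python sketch. It assumes units with $\hbar^2/2m = 1$ (so that $k^2 = E$, $\kappa^2 = U_0 - E$, and $\delta = U_0^{-1/2}$); the function names and the sample values of $d/\delta$ are illustrative choices, not taken from Fig. 8.

```python
import math

def _wave_numbers(energy, u0):
    # Assumed units: hbar**2 / 2m = 1, so k**2 = E and kappa**2 = U0 - E.
    return math.sqrt(energy), math.sqrt(u0 - energy)

def barrier_transparency(energy, u0, d):
    """Exact transparency of the rectangular barrier, Eq. (2.79b), for 0 < E < U0."""
    k, kappa = _wave_numbers(energy, u0)
    s = (kappa**2 - k**2) / (2.0 * kappa * k)
    return 1.0 / (math.cosh(kappa * d) ** 2 + (s * math.sinh(kappa * d)) ** 2)

def thick_barrier_transparency(energy, u0, d):
    """Thick-barrier asymptote of the transparency, Eq. (2.80)."""
    k, kappa = _wave_numbers(energy, u0)
    return (4.0 * k * kappa / (k**2 + kappa**2)) ** 2 * math.exp(-2.0 * kappa * d)

u0 = 1.0
delta = 1.0 / math.sqrt(u0)            # Eq. (2.67) in the assumed units
for ratio in (0.3, 1.0, 3.0, 10.0):
    d = ratio * delta
    exact = barrier_transparency(0.5 * u0, u0, d)
    approx = thick_barrier_transparency(0.5 * u0, u0, d)
    print(f"d/delta = {ratio:5.1f}   T = {exact:.3e}   Eq. (80): {approx:.3e}")
```

For thin barriers the asymptote is useless (it may even exceed 1), while already at thicknesses of a few $\delta$ the two results become nearly indistinguishable.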


Another interesting effect visible in Fig. 8a (for case d = 0.38) are the oscillations of T at E > Uo. 
This is our first glimpse at one more interesting quantum effect: resonant tunneling. I will discuss this 
effect in detail in Sec. 5 below. 


2.4. The WKB approximation

Before moving on to exploring more complex potentials, let us see whether the results discussed
in the previous section hold in the opposite limit of so-called soft, gradual potential profiles, like that
sketched in Fig. 4. (The quantitative conditions of the “softness” will be derived below.) The most
efficient analytical tool in this limit is the WKB (or “quasiclassical”) approximation developed by H.
Jeffreys, G. Wentzel, H. A. Kramers, and L. Brillouin in 1926-27.

In order to derive its 1D version, let us rewrite the Schrödinger equation (61) as

$$\frac{d^2\psi}{dx^2} + k^2(x)\psi = 0, \tag{2.81}$$

where the local value of wave number k(x) is defined similarly to Eq. (73),

$$k^2(x) \equiv \frac{2m[E - U(x)]}{\hbar^2}, \tag{2.82}$$

but now it may be a function of x. We already know that for k(x) = const, the fundamental solutions of
this equation have the form $A\exp\{+ikx\}$ and $B\exp\{-ikx\}$. Any of them may be presented in a simple form

21 See, e.g., G. Fursey, Field Emission in Vacuum Microelectronics, Kluwer, New York, 2005.

22 See, e.g., G. Binnig and H. Rohrer, Helv. Phys. Acta 55, 726 (1982).

23 See, e.g., V. Sverdlov et al., IEEE Trans. on Electron Devices 50, 1926 (2003).

$$\psi(x) = e^{i\Phi(x)}, \tag{2.83}$$

where $\Phi(x)$ is a complex function, in this simplest case equal to either $(kx - i\ln A)$ or $(-kx - i\ln B)$. This is
why we may try to use Eq. (83) to look for a solution of Eq. (81) even in the general case, $k(x) \neq$ const.
Differentiating Eq. (83) twice, we get

$$\frac{d\psi}{dx} = i\frac{d\Phi}{dx}e^{i\Phi}, \qquad \frac{d^2\psi}{dx^2} = \left[i\frac{d^2\Phi}{dx^2} - \left(\frac{d\Phi}{dx}\right)^2\right]e^{i\Phi}. \tag{2.84}$$


Plugging the last expression into Eq. (81) and requiring the factor before $\exp\{i\Phi(x)\}$ to vanish, we get

$$i\frac{d^2\Phi}{dx^2} - \left(\frac{d\Phi}{dx}\right)^2 + k^2(x) = 0. \tag{2.85}$$


This is still an exact, general result. At first sight, it looks worse than the initial equation
(81), because Eq. (85) is nonlinear. However, it is more ready for simplification in the limit when the
potential profile is very smooth, $dU/dx \to 0$. Indeed, we know that for a uniform potential, $d^2\Phi/dx^2 = 0$.
Hence, in the “0th” approximation, $\Phi(x) \to \Phi_0(x)$, we may try to keep that result, so that Eq. (85) yields

$$\left(\frac{d\Phi_0}{dx}\right)^2 = k^2(x). \tag{2.86a}$$

Just as in the uniform case, this equation has two roots,

$$\frac{d\Phi_0}{dx} = \pm k(x), \tag{2.86b}$$

so that its general solution is

$$\psi_0(x) = A\exp\left\{+i\int^x k(x')\,dx'\right\} + B\exp\left\{-i\int^x k(x')\,dx'\right\}, \tag{2.87}$$

where the choice of the lower limit of integration affects only constants A and B. The physical sense of this result
is simple: it is a sum of forward- and back-propagating waves, with the coordinate-dependent local wave
number k(x) that self-adjusts to the potential profile.

Let me emphasize the non-trivial nature of this approximation. 24 First, any attempt to address the
problem with a standard perturbation approach (say, $\psi = \psi_0 + \psi_1 + ...$, with $\psi_n$ proportional to the n-th power
of some small parameter, 25 in this case scaling as $d^2U/dx^2$) would fail for most potentials, because even a
slight but persisting deviation of U(x) from a constant leads to a gradual accumulation of phase $\Phi_0$,
impossible to describe by any small perturbation of $\psi$. Second, the dropping of the term $d^2\Phi/dx^2$ in Eq. (85)
is not too easy to justify. Indeed, since we are committed to the “soft potential limit” $dU/dx \to 0$, we
should be ready to assume the characteristic length a of the spatial variation of $\Phi$ to be large, and neglect


24 Philosophically, this space-domain method is very close to the time-domain rotating wave approximation
(RWA) used, for example, in the classical and quantum theory of oscillations - see, e.g., CM Secs. 4.2-4.5, and
Secs. 6.5, 7.6, 7.7, 9.2, and 9.4 of this course.

25 Such perturbation theories will be discussed in Chapter 6. 

the terms that are the smallest ones in the limit $a \to \infty$. However, both first terms in Eq. (85) are
apparently of the same order in a, namely $O(a^{-2})$; why have we neglected just one of them?

The price we have paid for such a “sloppy” treatment is high: Eq. (87) does not satisfy the
fundamental property of the Schrödinger equation, the probability current conservation. Indeed, since
Eq. (81) describes a fixed-energy (stationary) spatial part of the general Schrödinger equation, its
probability density $w = \Psi\Psi^* = \psi\psi^*$ should not depend on time. Hence, according to Eq. (6), we
should have I(x) = const. However, this is not true for each component of Eq. (87); for example for the
forward-propagating component of its right-hand part, Eq. (5) yields

$$I_0(x) = \frac{\hbar}{m}|A|^2k(x), \tag{2.88}$$

evidently not a constant if $k(x) \neq$ const.

The brilliance of the WKB theory is that the problem may be fixed without revising the 0th
approximation. Indeed, let us explore the next, 1st approximation instead:

$$\Phi(x) \to \Phi_{WKB}(x) \equiv \Phi_0(x) + \Phi_1(x), \tag{2.89}$$

where $\Phi_0$ still obeys Eq. (86), while $\Phi_1$ describes a small correction to the 0th approximation, in the
following sense: 26

$$\left|\frac{d\Phi_1}{dx}\right| \ll \left|\frac{d\Phi_0}{dx}\right| = k(x). \tag{2.90}$$


Plugging Eq. (89) into Eq. (85), with the account of the definition (86), we get

$$i\left(\frac{d^2\Phi_0}{dx^2} + \frac{d^2\Phi_1}{dx^2}\right) - \frac{d\Phi_1}{dx}\left(2\frac{d\Phi_0}{dx} + \frac{d\Phi_1}{dx}\right) = 0. \tag{2.91}$$

Using condition (90), we may neglect $d^2\Phi_1/dx^2$ in comparison with $d^2\Phi_0/dx^2$ in the first parentheses, and
$d\Phi_1/dx$ in comparison with $2d\Phi_0/dx$ in the second parentheses. As a result, we get the following
approximate result:

$$\frac{d\Phi_1}{dx} = \frac{i}{2}\,\frac{d^2\Phi_0}{dx^2}\bigg/\frac{d\Phi_0}{dx} = \frac{i}{2}\frac{d}{dx}\left(\ln\frac{d\Phi_0}{dx}\right) = \frac{i}{2}\frac{d}{dx}[\ln k(x)] = i\frac{d}{dx}[\ln k^{1/2}(x)], \tag{2.92}$$

$$i\Phi\big|_{WKB} = i\Phi_0 + i\Phi_1 = \pm i\int^x k(x')\,dx' + \ln\frac{1}{k^{1/2}(x)}, \tag{2.93}$$

$$\psi_{WKB}(x) = \frac{a}{k^{1/2}(x)}\exp\left\{i\int^x k(x')\,dx'\right\} + \frac{b}{k^{1/2}(x)}\exp\left\{-i\int^x k(x')\,dx'\right\}, \quad \text{for } k^2(x) > 0. \tag{2.94}$$

(Again, the lower integration limit is arbitrary, because its choice may be incorporated into complex
constants a and b.) This modification of the 0th approximation (87) overcomes the problem of current
continuity; for example, for the forward-propagating wave, Eq. (5) gives


26 For certainty, I will use the discretion given by Eq. (82) to define k(x) as the positive root of its right-hand part. 


$$I_{WKB}(x) = \frac{\hbar}{m}|a|^2 = \text{const}. \tag{2.95}$$


Physically, the factor $k^{1/2}$ in the denominator of the WKB wavefunction's pre-exponent is easy to
understand. The smaller the local group velocity (34) of the wave packet, $v_{gr}(x) = \hbar k(x)/m$, the “easier”
(more probable) it should be to find the particle within a certain interval dx. This is exactly the result
that the WKB approximation gives: $dW/dx \equiv w(x) = \psi\psi^* \propto 1/k(x) \propto 1/v_{gr}$.

Another value of the 1st approximation is a clarification of the WKB theory's validity condition: it is
given by Eq. (90). Plugging into this relation the first form of Eq. (92), and estimating $|d^2\Phi_0/dx^2|$ as $|d\Phi_0/dx|/a$,
where a is the spatial scale of a substantial change of $|d\Phi_0/dx| = k(x)$, we can rewrite the condition as

$$ka \gg 1. \tag{2.96}$$

In plain English, this means that the region where U(x), and hence k(x), change substantially should
contain many de Broglie wavelengths $\lambda = 2\pi/k$.


So far I have implied that $k^2(x) \propto E - U(x)$ is positive, i.e. the particle moves in the classically
accessible region. Now let us extend the WKB approximation to the situation where the difference E -
U(x) may change sign, for example to the reflection problem sketched in Fig. 4. Just as we did for the
sharp potential step, we first need to find the appropriate solution for the classically forbidden region, in
this case $x > x_c$. For that, there is no need to redo our calculations, because they are still valid if we, just
as in the sharp step problem, take $k(x) = i\kappa(x)$, where

$$\kappa^2(x) \equiv \frac{2m[U(x) - E]}{\hbar^2} > 0, \quad \text{for } x > x_c, \tag{2.97}$$


and keep just one of two possible solutions (with $\kappa > 0$), in analogy with Eq. (66). The result is

$$\psi_{WKB}(x) = \frac{c}{\kappa^{1/2}(x)}\exp\left\{-\int^x\kappa(x')\,dx'\right\}, \quad \text{for } k^2 < 0, \text{ i.e. } \kappa^2 > 0, \tag{2.98}$$

with the lower limit at some point with $\kappa^2 > 0$ as well. This is a really wonderful formula! It describes
the quantum-mechanical penetration of the particle into the classically forbidden region, and provides a
natural generalization of Eq. (66) - leaving intact, of course, our estimates of the depth $\delta \sim 1/\kappa$ of such
penetration.

Now we have to do what we have done for the sharp-step problem in Sec. 2: use the boundary
conditions at the interface point $x = x_c$ to relate constants a, b, and c. However, now this operation is a
tad more complex, because both WKB functions (94) and (98) diverge, albeit weakly, at the classical
turning point, where both k(x) and $\kappa(x)$ tend to zero. This connection problem may, however, be solved in
the following way. 27 Let us use the commitment to the potential's “softness”, assuming that it allows us to
keep just two leading terms in the Taylor expansion of function U(x) at point $x_c$:

$$U(x) \approx U(x_c) + \frac{dU}{dx}\bigg|_{x=x_c}(x - x_c) = E + \frac{dU}{dx}\bigg|_{x=x_c}(x - x_c). \tag{2.99}$$


27 An alternative way to solve the connection problem, without involving the Airy functions but using an 
analytical extension of WKB formulas to the plane of complex argument, may be found, e.g., in Sec. 47 of 
textbook by L. Landau and E. Lifshitz, Quantum Mechanics, Non-Relativistic Theory, 3rd ed. Pergamon, 1977. 

Using this truncated expansion, and introducing a dimensionless variable for the coordinate's deviation
from the classical turning point,

$$\zeta \equiv \frac{x - x_c}{x_0}, \quad \text{with } x_0 \equiv \left[\frac{\hbar^2}{2m(dU/dx)_{x=x_c}}\right]^{1/3}, \tag{2.100}$$

we reduce the Schrödinger equation (61) to the simple Airy equation

$$\frac{d^2\psi}{d\zeta^2} - \zeta\psi = 0. \tag{2.101}$$


As for all linear, ordinary differential equations of the second order, the general solution of Eq. (101)
may be presented as a linear combination of two fundamental solutions, in this case called the Airy
functions $\text{Ai}(\zeta)$ and $\text{Bi}(\zeta)$, shown in Fig. 9a.

Fig. 2.9. (a) Airy functions Ai and Bi, and (b) the WKB approximation for function $\text{Ai}(\zeta)$.


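The latter function diverges at $\zeta \to +\infty$, and thus is not suitable for our current problem (Fig. 4),
while the former function has the following asymptotic behaviors at $|\zeta| \gg 1$: 28

$$\text{Ai}(\zeta) \to \frac{1}{\pi^{1/2}|\zeta|^{1/4}} \times \begin{cases} \dfrac{1}{2}\exp\left\{-\dfrac{2}{3}\zeta^{3/2}\right\}, & \text{for } \zeta \to +\infty, \\[2ex] \sin\left\{\dfrac{2}{3}(-\zeta)^{3/2} + \dfrac{\pi}{4}\right\}, & \text{for } \zeta \to -\infty. \end{cases} \tag{2.102}$$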


Now let us apply the WKB approximation to the Airy equation (101). Taking the classical
turning point ($\zeta = 0$) for the lower limit, for $\zeta > 0$ we get (in dimensionless units)


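28 The following (exact!) integral formulas,

$$\text{Ai}(\zeta) = \frac{1}{\pi}\int_0^\infty\cos\left(\frac{\xi^3}{3} + \zeta\xi\right)d\xi, \qquad \text{Bi}(\zeta) = \frac{1}{\pi}\int_0^\infty\left[\exp\left\{-\frac{\xi^3}{3} + \zeta\xi\right\} + \sin\left(\frac{\xi^3}{3} + \zeta\xi\right)\right]d\xi,$$

are often convenient for practical calculation of Airy functions at intermediate values of the argument, $|\zeta| \sim 1$.
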
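$$\kappa^2(\zeta) = \zeta, \qquad \kappa(\zeta) = \zeta^{1/2}, \qquad \int_0^\zeta\kappa(\zeta')\,d\zeta' = \frac{2}{3}\zeta^{3/2}, \tag{2.103}$$

i.e. exactly the exponent in the first line of Eq. (102). Making a similar calculation for $\zeta < 0$, with the
natural assumption |b| = |a| (full reflection from the potential step), we arrive at the following result:

$$\text{Ai}_{WKB}(\zeta) = \frac{1}{|\zeta|^{1/4}} \times \begin{cases} c\exp\left\{-\dfrac{2}{3}\zeta^{3/2}\right\}, & \text{for } \zeta > 0, \\[2ex] a\sin\left\{\dfrac{2}{3}(-\zeta)^{3/2} + \varphi\right\}, & \text{for } \zeta < 0. \end{cases} \tag{2.104}$$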


This approximation differs from the exact solution at small values of $\zeta$, i.e. close to the classical
turning point - see Fig. 9b. However, at $|\zeta| \gg 1$, Eqs. (104) describe the Airy function exactly if

$$\varphi = \frac{\pi}{4} \quad \text{and} \quad c = \frac{a}{2}. \tag{2.105}$$


Hence we can use these connection formulas to express the relations between coefficients a, b, and c of
the general WKB solutions (94) and (98). In particular, the first of them yields $b = -a\exp\{i\pi/2\}$, so that
Eq. (94) becomes

$$\psi_{WKB}(x < x_c) = \frac{a}{k^{1/2}(x)}\left(\exp\left\{i\int_{x_c}^x k(x')\,dx'\right\} - \exp\left\{-i\int_{x_c}^x k(x')\,dx' + i\frac{\pi}{2}\right\}\right). \tag{2.106}$$

This result may be also described by a simple mnemonic rule: reflecting from a “soft” potential step, the
wavefunction acquires an additional phase shift $\Delta\varphi = \pi/2$, if compared with the reflection from a “hard”
(vertical) potential wall located at $x = x_c$, for which, according to Eq. (1.76), we would have b = -a.

Let us quantify the condition of validity of the connection formulas (105) - in other words, the
criterion of the step “softness”. For that, within the region where the WKB approximation differs from
the exact solution of the Airy equation ($|\zeta| \sim 1$, i.e. $|x - x_c| \sim x_0$), the deviation from the linear approximation (99) of
the potential profile should be relatively small. This deviation may be estimated using the next term of
the Taylor expansion, $(d^2U/dx^2)_{x=x_c}(x - x_c)^2/2$. As a result, the softness condition may be expressed as
$|d^2U/dx^2|_{x=x_c}x_0 \ll |dU/dx|_{x=x_c}$. With the account of Eq. (100) for $x_0$, the condition becomes

$$\left|\frac{d^2U}{dx^2}\right|_{x=x_c} \ll \left(\frac{2m}{\hbar^2}\right)^{1/3}\left|\frac{dU}{dx}\right|_{x=x_c}^{4/3}. \tag{2.107}$$

As an example of a very useful application of the WKB approximation, let us use it to calculate
the energy spectrum of a 1D particle in a soft 1D quantum well (Fig. 10). As was discussed above, we
may always consider the standing wave describing an eigenstate $\psi_n$ (corresponding to eigenenergy $E_n$)
as a traveling wave going back and forth between the walls, being sequentially reflected by each of
them. Let us apply the WKB approximation to such a traveling wave. First, according to Eq. (94),
propagating from the left classical turning point $x_L$ to the right point $x_R$, it acquires the phase change

$$\Delta\varphi_\to = \int_{x_L}^{x_R}k(x)\,dx. \tag{2.108}$$

At the reflection from the soft wall at $x_R$, according to the connection formula (106), the wave
acquires an additional shift $\pi/2$. Now, traveling back from $x_R$ to $x_L$, the wave gets a shift similar to the one
given by Eq. (108): $\Delta\varphi_\leftarrow = \Delta\varphi_\to$. Finally, at the reflection from $x_L$ it gets one more $\pi/2$. Summing up all
these contributions, we may write the self-consistency condition (that the wavefunction “catches its own
tail with its teeth”), in the form

$$\Delta\varphi_{total} \equiv \Delta\varphi_\to + \frac{\pi}{2} + \Delta\varphi_\leftarrow + \frac{\pi}{2} = 2\int_{x_L}^{x_R}k(x)\,dx + \pi = 2\pi n, \quad \text{with } n = 1, 2, ... \tag{2.109}$$

Rewriting this result in terms of the particle's momentum $p(x) = \hbar k(x)$, we arrive at the famous 1D
Bohr-Sommerfeld quantization rule

$$\oint_C p(x)\,dx = 2\pi\hbar\left(n - \frac{1}{2}\right), \tag{2.110}$$

where the closed path C means the full period of classical motion. 29



Fig. 2.10. Quasiclassical treatment of eigenstates in a soft 1D potential well.


Let us see what this rule gives for the very important particular case of a quadratic potential profile of a harmonic oscillator of frequency $\omega_0$. In this case,

$$U(x) = \frac{m}{2}\omega_0^2x^2, \tag{2.111}$$

and the classical turning points are the roots of a simple equation

$$\frac{m}{2}\omega_0^2x_c^2 = E_n, \tag{2.112}$$


so that $x_R = x_c = (2E_n/m)^{1/2}/\omega_0 > 0$, $x_L = -x_c < 0$. Due to the potential’s symmetry, the integration required by Eq. (110) is also simple:

$$\int_{x_L}^{x_R} p(x)\,dx = \int_{-x_c}^{+x_c}\{2m[E_n - U(x)]\}^{1/2}dx = (2mE_n)^{1/2}\int_{-x_c}^{+x_c}\left(1 - \frac{x^2}{x_c^2}\right)^{1/2}dx = (2mE_n)^{1/2}x_c\frac{\pi}{2} = \frac{\pi E_n}{\omega_0}, \tag{2.113}$$

so that Eq. (110), whose closed-path integral is twice this value, is satisfied if

Harmonic oscillator’s energy levels

$$E_n = \hbar\omega_0\left(n - \frac{1}{2}\right), \quad \text{with } n = 1, 2, \ldots \tag{2.114}$$

29 Note that at motion in more than one dimension, a closed classical trajectory may have no turning points. In this case, the constant 1/2 in the parentheses of Eq. (110), arising from the turns, should be dropped. The simplest example is the circular motion of the electron about the proton in Bohr’s picture of the hydrogen atom, for which the modified quantization condition (110) takes the form (1.10) postulated by N. Bohr. (A similar relation for the radial motion is sometimes called the Sommerfeld-Wilson quantization rule.)


In order to estimate the validity of this result, we have to check condition (96) at all points of the 
classically allowed region, and Eq. (107) at the turning points. A straightforward calculation shows that 
both conditions are valid for n » 1 . However, we will see below that Eq. (114) is actually exactly 
correct for all energy levels - thanks to special properties of potential profile (111). 
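The quantization rule (110) is easy to check numerically. The following minimal Python sketch is only an illustration, not part of the derivation above: it assumes units with ħ = m = ω₀ = 1, evaluates the closed-path action integral for potential (111) by the midpoint rule, and solves Eq. (110) for the energy by bisection. The helper names `action` and `wkb_level` are hypothetical.

```python
import math

# Illustrative units (an assumption of this sketch): hbar = m = omega_0 = 1.
HBAR, M, OMEGA0 = 1.0, 1.0, 1.0

def potential(x):
    """Quadratic potential of Eq. (2.111)."""
    return 0.5 * M * OMEGA0**2 * x**2

def action(energy, steps=2000):
    """Closed-path integral of p(x)dx: twice the integral between the turning points."""
    xc = math.sqrt(2.0 * energy / M) / OMEGA0  # turning point from Eq. (2.112)
    du = math.pi / steps
    total = 0.0
    for i in range(steps):
        # The substitution x = xc*sin(u) removes the square-root singularities
        # of the integrand at the turning points.
        u = -0.5 * math.pi + (i + 0.5) * du
        x = xc * math.sin(u)
        p = math.sqrt(max(0.0, 2.0 * M * (energy - potential(x))))
        total += p * xc * math.cos(u) * du
    return 2.0 * total

def wkb_level(n, e_lo=1e-9, e_hi=100.0):
    """Solve the Bohr-Sommerfeld rule (2.110) for E_n by bisection."""
    target = 2.0 * math.pi * HBAR * (n - 0.5)
    for _ in range(60):
        e_mid = 0.5 * (e_lo + e_hi)
        if action(e_mid) < target:
            e_lo = e_mid
        else:
            e_hi = e_mid
    return 0.5 * (e_lo + e_hi)

for n in (1, 2, 3):
    # Second column: numerical WKB level; third column: Eq. (2.114).
    print(n, wkb_level(n), HBAR * OMEGA0 * (n - 0.5))
```

The same two functions may be reused for any other soft well by replacing `potential` and the turning-point expression; for non-quadratic profiles the result is, of course, only approximate.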

Now, let us look at the second of connection formulas (105), c = all. Again, it differs from the 
result (71) for a sharp potential step, that may be rewritten as 


$$C = A\frac{2k}{k + i\kappa} = A\frac{2}{[1 + (\kappa/k)^2]^{1/2}}\exp\{-i\theta\}, \quad \text{with } \theta = \tan^{-1}\frac{\kappa}{k}, \tag{2.115}$$


by both the modulus and the phase factor. (In the WKB approximation, the latter factor always equals $\pi/4$.) Hence, again, the WKB approximation’s prediction is not exact for sharp potentials; nevertheless, it is broadly used for practical calculations. One of the most important of them is the transparency of an arbitrary but smooth potential barrier (Fig. 11).

Fig. 2.11. 1D potential barrier of


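Here, just as in the case of a rectangular barrier, we need to take into consideration five partial “waves” (or rather fundamental solutions of the Schrödinger equation): 30

$$\psi_{WKB} = \begin{cases} \dfrac{a}{k^{1/2}(x)}\exp\left\{i\int^x k(x')dx'\right\} + \dfrac{b}{k^{1/2}(x)}\exp\left\{-i\int^x k(x')dx'\right\}, & \text{for } x < x_c, \\ \dfrac{c}{\kappa^{1/2}(x)}\exp\left\{-\int^x \kappa(x')dx'\right\} + \dfrac{d}{\kappa^{1/2}(x)}\exp\left\{\int^x \kappa(x')dx'\right\}, & \text{for } x_c < x < x_c', \\ \dfrac{f}{k^{1/2}(x)}\exp\left\{i\int^x k(x')dx'\right\}, & \text{for } x_c' < x, \end{cases} \tag{2.116}$$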


where the lower limits of the integrals are arbitrary (each within the corresponding range of $x$). Since on the right of the left classical turning point we have two exponents rather than one, and on the right of the second point, one traveling wave rather than two, the connection formulas (105) have to be generalized, using asymptotic formulas not only for $\mathrm{Ai}(\zeta)$, but also for the second Airy function, $\mathrm{Bi}(\zeta)$. The analysis, absolutely similar to that carried out above (though naturally a bit more bulky), 31 gives a remarkably simple result:

Soft tunnel barrier’s transparency

$$T_{WKB} \equiv \left|\frac{f}{a}\right|^2 = \exp\left\{-2\int_{x_c}^{x_c'}\kappa(x)\,dx\right\} = \exp\left\{-\frac{2}{\hbar}\int_{x_c}^{x_c'}(2m[U(x) - E])^{1/2}dx\right\}, \tag{2.117}$$

with no pre-exponential factor. This formula is broadly used in applied quantum mechanics, despite the approximate character of its pre-exponential coefficient for insufficiently soft barriers that do not satisfy Eq. (107). For example, Eq. (80) shows that for a thick rectangular barrier with $\kappa = k$, i.e. $U_0 = 2E$, the WKB approximation (117) underestimates $T$ by a factor of 4. However, on the logarithmic scale of Fig. 8b, such a factor, about half an order of magnitude, still looks like a small correction.

30 Sorry, but the same letter, $d$, is used here for the barrier thickness (defined in this case as the classically forbidden region length, $x_c' - x_c$), and the constant in one of the wave amplitudes - see Eq. (116). Let me hope that the difference between these uses is absolutely evident from the context.
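The factor-of-4 statement may be verified with a few lines of Python. This is only a hedged sketch: the exact transparency of a rectangular barrier is taken here in its standard textbook form, $T = [1 + ((k^2 + \kappa^2)/2k\kappa)^2\sinh^2\kappa d]^{-1}$, which I assume to coincide with Eq. (79); the function names and the midpoint-rule integration are illustrative choices.

```python
import math

def exact_rect_transparency(k, kappa, d):
    """Exact transparency of a rectangular barrier of thickness d at E < U0
    (standard textbook result, assumed here to coincide with Eq. (79))."""
    s = math.sinh(kappa * d)
    return 1.0 / (1.0 + ((k * k + kappa * kappa) / (2.0 * k * kappa)) ** 2 * s * s)

def wkb_transparency(kappa_of_x, x_left, x_right, steps=2000):
    """Eq. (2.117): exp{-2 * integral of kappa(x)dx} over the classically
    forbidden region, with the integral taken by the midpoint rule."""
    dx = (x_right - x_left) / steps
    integral = sum(kappa_of_x(x_left + (i + 0.5) * dx) for i in range(steps)) * dx
    return math.exp(-2.0 * integral)

k = kappa = 1.0  # the case U0 = 2E discussed in the text (arbitrary units)
d = 10.0         # a "thick" barrier: kappa*d >> 1
ratio = exact_rect_transparency(k, kappa, d) / wkb_transparency(lambda x: kappa, 0.0, d)
print(ratio)     # tends to 4 as kappa*d grows
```

Since `wkb_transparency` accepts an arbitrary function $\kappa(x)$, the same sketch may be applied to any smooth barrier, where no simple exact result is available for comparison.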

Notice that when $E$ approaches the barrier top $U_{max}$ (Fig. 11), points $x_c$ and $x_c'$ merge, so that, according to Eq. (117), $T_{WKB} \to 1$, i.e. the particle reflection vanishes at $E = U_{max}$. However, this conclusion is incorrect even for smooth barriers where one could naively expect the WKB approximation to work perfectly. Indeed, near the point $x = x_m$ where the potential reaches its maximum (i.e. $U(x_m) = U_{max}$), we may always approximate a smooth function $U(x)$ by an inverted parabola,

$$U(x) \approx U_{max} - \frac{m\omega_0^2}{2}(x - x_m)^2. \tag{2.118}$$

Calculating the derivatives $dU/dx$ and $d^2U/dx^2$ of this function and plugging them into condition (107), we see that the WKB approximation is only valid if $|U_{max} - E| \gg \hbar\omega_0$. An exact analysis 32 of tunneling through barrier (118) gives the following Kemble formula:

$$T = \frac{1}{1 + \exp\{-2\pi(E - U_{max})/\hbar\omega_0\}}, \tag{2.119}$$

valid for any sign of the difference $(E - U_{max})$. This formula describes a gradual approach of $T$ to 1, i.e. a gradual reduction of reflection at the increase of the particle’s energy, with $T = 1/2$ (rather than 1) at $E = U_{max}$.
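To see how the Kemble formula connects with the WKB result, one may compare the two numerically. In the sketch below (an illustration with arbitrary energy units and hypothetical function names), the WKB transparency is Eq. (117) with the integral taken analytically for the inverted parabola (118), which gives $\exp\{-2\pi(U_{max} - E)/\hbar\omega_0\}$ at $E < U_{max}$.

```python
import math

def kemble_transparency(energy, u_max, hbar_omega0):
    """Kemble formula (2.119) for the inverted-parabola barrier (2.118)."""
    return 1.0 / (1.0 + math.exp(-2.0 * math.pi * (energy - u_max) / hbar_omega0))

def wkb_parabolic_transparency(energy, u_max, hbar_omega0):
    """Eq. (2.117) with the integral evaluated analytically for barrier (2.118);
    meaningful only for energy < u_max."""
    return math.exp(-2.0 * math.pi * (u_max - energy) / hbar_omega0)

U_MAX, HW = 1.0, 0.1  # illustrative values (assumptions), arbitrary energy units
for e in (0.5, 0.9, 1.0, 1.1):
    row = [e, kemble_transparency(e, U_MAX, HW)]
    if e < U_MAX:
        row.append(wkb_parabolic_transparency(e, U_MAX, HW))
    print(*row)
```

Well below the barrier top the two columns practically coincide, while at $E = U_{max}$ the Kemble formula gives exactly 1/2, where the WKB expression would give 1.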


Now the last remark of this section: our discussions of the propagator and the WKB approximation open a straight way toward an alternative formulation of quantum mechanics, based on the Feynman path integral, but I will postpone its discussion until a more compact (“bra-ket”) notation has been introduced in Chapter 4.


2.5. Transfer matrix, resonant tunneling, and metastable states 

Let us now explore motion in more complex potential profiles. The piecewise-constant and smooth-potential models of $U(x)$ are not too convenient here, because they both require “stitching” local solutions in each classical turning point, which may lead to very cumbersome calculations. However, we may get a very good insight into the physical phenomena in such profiles, using their approximation by a set of Dirac’s delta-functions. For that, let us have a look at what our old result (79) gives in the limit of a very thin and high rectangular barrier, $d \ll \delta$, $E \ll U_0$ (giving $k \ll \kappa \ll 1/d$):

$$T = \frac{1}{1 + \alpha^2}, \tag{2.120}$$

where parameter $\alpha$ is defined as

$$\alpha \equiv \frac{1}{2}\left(\frac{\kappa^2 + k^2}{\kappa k}\right)\kappa d \approx \frac{1}{2}\frac{\kappa^2d}{k} \approx \frac{m}{\hbar^2k}U_0d. \tag{2.121}$$

The last product, $U_0d$, is just the “area”

$$\mathcal{W} \equiv \int_{U(x)>E}U(x)\,dx \tag{2.122}$$

of the barrier. This fact implies that the very simple result (120) for the transparency may be correct for a barrier of any shape, provided that it is sufficiently thin and high.

31 Note, however, that in the most important case $T_{WKB} \ll 1$, Eq. (117) may be simply derived from Eqs. (105) - an exercise left for the reader.

32 It was carried out by E. Kemble in 1935. Notice that mathematically the Kemble formula is similar to the Fermi distribution in statistical physics, with effective temperature $T_{ef} = \hbar\omega_0/2\pi k_B$. This similarity has some interesting implications for the statistics of Fermi gas tunneling.

Indeed, let us consider the tunneling problem for a very thin barrier with $\kappa d, kd \ll 1$ (Fig. 12), approximating it by Dirac’s delta-function:

$$U(x) = \mathcal{W}\delta(x). \tag{2.123}$$

Fig. 2.12. Delta-functional tunnel barrier.


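We already know the solutions in all points but $x = 0$ - see Eqs. (63) and (77) - so we only need to analyze the boundary conditions in that point to find coefficients $A$, $B$, and $F$ - or rather the ratios $B/A$ and $F/A$. However, due to the special character of the $\delta$-function, we should be careful here. Indeed, instead of Eq. (68) we now get

$$\frac{d\psi}{dx}(+0) - \frac{d\psi}{dx}(-0) = \lim_{\varepsilon\to 0}\int_{-\varepsilon}^{+\varepsilon}\frac{d^2\psi}{dx^2}dx = \lim_{\varepsilon\to 0}\int_{-\varepsilon}^{+\varepsilon}\frac{2m}{\hbar^2}[U(x) - E]\psi\,dx = \frac{2m}{\hbar^2}\mathcal{W}\psi(0). \tag{2.124}$$

On the other hand, the wavefunction itself is still continuous:

$$\psi(+0) - \psi(-0) = \lim_{\varepsilon\to 0}\int_{-\varepsilon}^{+\varepsilon}\frac{d\psi}{dx}dx = 0. \tag{2.125}$$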

Using these boundary conditions, we readily get the following system of two linear equations,

$$A + B = F, \qquad ikF - (ikA - ikB) = \frac{2m\mathcal{W}}{\hbar^2}F, \tag{2.126}$$

whose solution yields

$$\frac{B}{A} = \frac{-i\alpha}{1 + i\alpha}, \qquad \frac{F}{A} = \frac{1}{1 + i\alpha}, \qquad \text{where } \alpha \equiv \frac{m\mathcal{W}}{\hbar^2k}. \tag{2.127}$$

For the barrier transparency $T \equiv |F/A|^2$, this result again gives Eq. (120). That formula may be recast to give a simple expression (valid only for $E \ll U_{max}$) for the transmission coefficient,

Thin barrier’s transparency

$$T = \frac{E}{E + E_0}, \qquad \text{where } E_0 \equiv \frac{m\mathcal{W}^2}{2\hbar^2}, \tag{2.128}$$

that shows that as energy becomes larger than parameter $E_0$, the barrier’s transparency approaches unity.
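The claim that only the “area” of a thin barrier matters may be illustrated numerically. The sketch below rests on stated assumptions: units with ħ = m = 1, the standard exact formula for a rectangular barrier (which I assume to coincide with Eq. (79)), and hypothetical function names. It shrinks the barrier thickness $d$ at a fixed area $U_0d$ and compares the exact transparency with the delta-functional result (120), (127), and with its energy form (128).

```python
import math

HBAR, M = 1.0, 1.0  # illustrative units (an assumption of this sketch)

def rect_transparency(energy, u0, d):
    """Exact rectangular-barrier transparency at E < U0
    (standard result, assumed here to coincide with Eq. (79))."""
    k = math.sqrt(2.0 * M * energy) / HBAR
    kappa = math.sqrt(2.0 * M * (u0 - energy)) / HBAR
    s = math.sinh(kappa * d)
    return 1.0 / (1.0 + ((k * k + kappa * kappa) / (2.0 * k * kappa)) ** 2 * s * s)

def delta_transparency(energy, area):
    """Eqs. (2.120) and (2.127): T = 1/(1 + alpha^2), alpha = m*W/(hbar^2*k)."""
    k = math.sqrt(2.0 * M * energy) / HBAR
    alpha = M * area / (HBAR**2 * k)
    return 1.0 / (1.0 + alpha**2)

def delta_transparency_energy_form(energy, area):
    """Eq. (2.128): T = E/(E + E0), E0 = m*W^2/(2*hbar^2)."""
    e0 = M * area**2 / (2.0 * HBAR**2)
    return energy / (energy + e0)

ENERGY, AREA = 1.0, 2.0  # arbitrary illustrative values
for d in (1e-1, 1e-2, 1e-3):
    # Keep the area U0*d fixed while the barrier gets thinner and higher.
    print(d, rect_transparency(ENERGY, AREA / d, d), delta_transparency(ENERGY, AREA))
```

As $d$ decreases, the first column of transparencies converges to the second one, which does not depend on $d$ at all.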


However, the most important application of Eqs. (126) is for deriving the transparency of more complex potential profiles. For that, let us first introduce very general notions of the scattering and transfer matrices, currently for the 1D case. Consider an arbitrary but finite-length potential “bump” (more formally called a scatterer), localized somewhere between points $x_1$ and $x_2$, on the flat potential background, say $U = 0$ (Fig. 13). We know that the general solution, with a certain energy $E$, outside the interval is a set of two sinusoidal waves. Let us present them in the form

$$\psi_j = A_je^{ik(x - x_j)} + B_je^{-ik(x - x_j)}, \tag{2.129}$$

where (for now) $j = 1$ or 2, and $(\hbar k)^2/2m = E$. Note that each of the wave pairs (129) has, in this notation, its own reference point $x_j$, because this is very convenient for the calculations that follow.

Fig. 2.13. A single 1D scatterer.


As we have already discussed, if the wave/particle is incident from the left, the linear Schrödinger equation within the scatterer range ($x_1 < x < x_2$) can provide only linear expressions of the transmitted ($A_2$) and reflected ($B_1$) wave amplitudes via the incident wave amplitude $A_1$:

$$A_2 = S_{21}A_1, \qquad B_1 = S_{11}A_1, \tag{2.130}$$

where $S_{11}$ and $S_{21}$ are certain (generally, complex) coefficients. In this case, $B_2 = 0$. Alternatively, if a wave, with amplitude $B_2$, is incident from the right, it also may induce a transmitted wave ($B_1$) and a reflected wave ($A_2$) with amplitudes

$$B_1 = S_{12}B_2, \qquad A_2 = S_{22}B_2, \tag{2.131}$$

where coefficients $S_{22}$ and $S_{12}$ are generally different from $S_{11}$ and $S_{21}$. Now we can use the linear superposition principle to argue that if waves $A_1$ and $B_2$ are simultaneously incident on the scatterer (say, because wave $B_2$ has been partly reflected back by some other scatterer located at $x > x_2$), the resulting scattered wave amplitudes $A_2$ and $B_1$ are just the sums of their values for separate incident waves:

$$B_1 = S_{11}A_1 + S_{12}B_2, \qquad A_2 = S_{21}A_1 + S_{22}B_2. \tag{2.132}$$


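These linear relations may be conveniently presented by the so-called scattering matrix (frequently called just the “S-matrix”):

Scattering matrix: definition

$$\begin{pmatrix} B_1 \\ A_2 \end{pmatrix} = \mathrm{S}\begin{pmatrix} A_1 \\ B_2 \end{pmatrix}, \qquad \mathrm{S} \equiv \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}. \tag{2.133}$$

Scattering matrices, duly generalized, are an important tool for the analysis of wave scattering in more than one dimension; for 1D problems, however, another matrix is more convenient to present the same linear relations (132). Indeed, let us solve this system for $A_2$ and $B_2$. The result is

Transfer matrix: definition

$$A_2 = T_{11}A_1 + T_{12}B_1, \quad B_2 = T_{21}A_1 + T_{22}B_1, \qquad \text{i.e. } \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} = \mathrm{T}\begin{pmatrix} A_1 \\ B_1 \end{pmatrix}, \tag{2.134}$$

where T is the transfer matrix with elements

$$T_{11} = S_{21} - \frac{S_{11}S_{22}}{S_{12}}, \qquad T_{12} = \frac{S_{22}}{S_{12}}, \qquad T_{21} = -\frac{S_{11}}{S_{12}}, \qquad T_{22} = \frac{1}{S_{12}}. \tag{2.135}$$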
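The conversion (135) is a one-liner in code. The following Python sketch is only an illustration with hypothetical function names: it takes the reflection and transmission ratios of the delta-functional barrier from Eq. (127) as $S_{11}$ and $S_{21}$, assumes $S_{22} = S_{11}$ and $S_{12} = S_{21}$ from the mirror symmetry of that barrier, and converts the resulting scattering matrix into the transfer matrix.

```python
def s_to_t(s):
    """Convert a 2x2 scattering matrix into the transfer matrix, using Eq. (2.135)."""
    (s11, s12), (s21, s22) = s
    return (
        (s21 - s11 * s22 / s12, s22 / s12),
        (-s11 / s12, 1.0 / s12),
    )

def delta_barrier_s(alpha):
    """S-matrix of a delta-functional barrier. S11 and S21 follow from Eq. (2.127);
    S22 = S11 and S12 = S21 are assumed here from the barrier's mirror symmetry."""
    r = -1j * alpha / (1.0 + 1j * alpha)
    t = 1.0 / (1.0 + 1j * alpha)
    return ((r, t), (t, r))

alpha = 0.7  # arbitrary illustrative value
s_matrix = delta_barrier_s(alpha)
t_matrix = s_to_t(s_matrix)
print(t_matrix)
# The transparency |S21|^2 should reproduce Eq. (2.120), T = 1/(1 + alpha^2).
print(abs(s_matrix[1][0]) ** 2, 1.0 / (1.0 + alpha**2))
```

For this particular scatterer, the four printed elements reduce to $1 - i\alpha$, $-i\alpha$, $i\alpha$, and $1 + i\alpha$.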


One can wonder whether matrices S and T obey any universal properties that would be valid for an arbitrary (but time-independent) scatterer. Such universal relations may be readily found from the probability current conservation and the time-reversal symmetry of the Schrödinger equation. Let me leave finding these relations for the reader’s exercise. The results show, in particular, that the scattering matrix may be rewritten in the following form:

$$\mathrm{S} = e^{i\theta}\begin{pmatrix} re^{i\varphi} & t \\ t & -re^{-i\varphi} \end{pmatrix}, \tag{2.136a}$$

where 4 real parameters $r$, $t$, $\theta$, and $\varphi$ satisfy just one universal relation:

$$r^2 + t^2 = 1 \tag{2.136b}$$

(so that only 3 of the parameters are independent). As a result of this symmetry, $T_{11}$ may be also presented in a simpler form, similar to $T_{22}$: $T_{11} = \exp\{i\theta\}/t = 1/S_{12}^* = 1/S_{21}^*$. The last form allows a ready expression of the scatterer’s transparency via just one coefficient of the transfer matrix:

$$T \equiv \left|\frac{A_2}{A_1}\right|^2_{B_2 = 0} = |S_{21}|^2 = |T_{11}|^{-2}. \tag{2.137}$$


In our current context, the most important property of 1D transfer matrices is that in order to find the total transfer matrix T of a system consisting of several (say, $N$) sequential arbitrary scatterers (Fig. 14), it is sufficient to multiply their matrices. Indeed, extending the definition (134) to other points $x_j$ ($j = 1, 2, \ldots, N + 1$), we can write

$$\begin{pmatrix} A_2 \\ B_2 \end{pmatrix} = \mathrm{T}_1\begin{pmatrix} A_1 \\ B_1 \end{pmatrix}, \qquad \begin{pmatrix} A_3 \\ B_3 \end{pmatrix} = \mathrm{T}_2\begin{pmatrix} A_2 \\ B_2 \end{pmatrix} = \mathrm{T}_2\mathrm{T}_1\begin{pmatrix} A_1 \\ B_1 \end{pmatrix}, \tag{2.138}$$

etc. (where the matrix indices indicate the scatterers’ order on axis $x$), so that

$$\begin{pmatrix} A_{N+1} \\ B_{N+1} \end{pmatrix} = \mathrm{T}_N\mathrm{T}_{N-1}\ldots\mathrm{T}_1\begin{pmatrix} A_1 \\ B_1 \end{pmatrix}. \tag{2.139}$$

Fig. 2.14. A sequence of several 1D scatterers.

But we can also define the total transfer matrix similarly to Eq. (134), i.e. as

$$\begin{pmatrix} A_{N+1} \\ B_{N+1} \end{pmatrix} = \mathrm{T}\begin{pmatrix} A_1 \\ B_1 \end{pmatrix}, \tag{2.140}$$

so that finally

Transfer matrix of a composite scatterer

$$\mathrm{T} = \mathrm{T}_N\mathrm{T}_{N-1}\ldots\mathrm{T}_1. \tag{2.141}$$


This formula is valid even if the flat-potential gaps between component scatterers vanish, so that it may be applied to a scatterer with an arbitrary profile $U(x)$, by fragmenting its length into small segments $\Delta x = x_{j+1} - x_j$, and treating each fragment as a rectangular barrier of the average height $(U_j)_{ef} = [U(x_{j+1}) + U(x_j)]/2$ - see Fig. 15. Since very efficient numerical algorithms are readily available for fast multiplication of matrices (especially as small as 2×2), this approach is broadly used in practice for the computation of transparency of tunnel barriers with complicated profiles $U(x)$. (This is much more efficient than the direct numerical solution of the Schrödinger equation.)

Fig. 2.15. The transfer matrix approach to a long tunnel barrier of an arbitrary profile.


In order to use this approach for several conceptually important systems, let us calculate the transfer matrices for a few elementary scatterers, starting from the delta-functional barrier located at $x = 0$. Taking $x_1 = x_2 = 0$, we can merely change the notation of wave amplitudes in Eq. (127) to get

$$S_{11} = \frac{-i\alpha}{1 + i\alpha}, \qquad S_{21} = \frac{1}{1 + i\alpha}. \tag{2.142a}$$

An absolutely similar analysis of the wave incidence from the right yields

$$S_{22} = \frac{-i\alpha}{1 + i\alpha}, \qquad S_{12} = \frac{1}{1 + i\alpha}, \tag{2.142b}$$

and using Eqs. (135), we get

Transfer matrix of a short scatterer

$$\mathrm{T}_\alpha = \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix}. \tag{2.143}$$


The next example may seem strange at first glance: what if there is no scatterer at all between points $x_1$ and $x_2$? If points $x_1$ and $x_2$ coincide, the answer is indeed trivial and can be obtained, e.g., from Eq. (143) by taking $\mathcal{W} = 0$, i.e. $\alpha = 0$:

Identity matrix

$$\mathrm{T}_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{2.144}$$

- the so-called identity matrix. However, we are free to choose the reference points $x_{1,2}$ participating in Eq. (129) as we wish. For example, what if $x_2 - x_1 = a$? Let us first take the forward-propagating wave alone: $B_2 = 0$ (and hence $B_1 = 0$); then

$$\psi_2 = \psi_1 = A_1e^{ik(x - x_1)} = A_1e^{ik(x_2 - x_1)}e^{ik(x - x_2)}. \tag{2.145}$$

Comparison of this expression with the definition (129) for $j = 2$ shows that $A_2 = A_1\exp\{ik(x_2 - x_1)\} = A_1\exp\{ika\}$, i.e. $T_{11} = \exp\{ika\}$. Repeating the calculation for the back-propagating wave, we see that $T_{22} = \exp\{-ika\}$, and since this “no-potential” (space interval) provides no particle reflection, we finally get

Transfer matrix of a space interval

$$\mathrm{T}_a = \begin{pmatrix} e^{ika} & 0 \\ 0 & e^{-ika} \end{pmatrix}, \tag{2.146}$$

independently of the mutual position of points $x_1$ and $x_2$. At $a = 0$, we naturally recover the special case (144).


Now let us use these results to analyze the double-barrier system shown in Fig. 16. We could of course calculate its properties as before, writing down explicit expressions for all 5 traveling waves shown by arrows in Fig. 16, and then using boundary conditions (124) and (125) at each of points $x_{1,2}$ to get a system of 4 linear equations, and then solving it for 4 amplitude ratios.

Fig. 2.16. Double-barrier system. Dashed lines show (schematically) the position of metastable energy levels.

However, the transfer matrix approach simplifies the calculations, because we may immediately use Eqs. (141), (143), and (146) to write

$$\mathrm{T} = \mathrm{T}_\alpha\mathrm{T}_a\mathrm{T}_\alpha = \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix}\begin{pmatrix} e^{ika} & 0 \\ 0 & e^{-ika} \end{pmatrix}\begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix}. \tag{2.147}$$


Let me hope that the reader remembers the “row by column” rule of the multiplication of square matrices; 33 using it for the last two matrices, we reduce Eq. (147) to

$$\mathrm{T} = \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix}\begin{pmatrix} (1 - i\alpha)e^{ika} & -i\alpha e^{ika} \\ i\alpha e^{-ika} & (1 + i\alpha)e^{-ika} \end{pmatrix}. \tag{2.148}$$

Now there is no need to calculate all elements of the full product T, because, according to Eq. (137), for the calculation of the barrier transparency $T$ we need only one of its elements, $T_{11}$:

$$T = \frac{1}{|T_{11}|^2} = \frac{1}{\left|\alpha^2e^{-ika} + (1 - i\alpha)^2e^{ika}\right|^2}. \tag{2.149}$$

This result is similar to that following from Eq. (79) for $E > U_0$: the transparency is a $\pi$-periodic function of the product $ka$, reaching the maximum ($T = 1$) at some point of each period - see Fig. 17a.
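The matrix product (147) and the closed-form result (149) are easy to cross-check numerically. The Python sketch below is an illustration only; its function names are hypothetical, and the resonance position $ka = \pi/2 + \tan^{-1}\alpha$ used at the end is not quoted from the text but follows from requiring that the two terms under the modulus in Eq. (149) have opposite phases.

```python
import cmath
import math

def matmul2(a, b):
    """'Row by column' product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][r] * b[r][j] for r in range(2)) for j in range(2))
        for i in range(2)
    )

def delta_barrier(alpha):
    """Transfer matrix (2.143) of a delta-functional barrier."""
    return ((1 - 1j * alpha, -1j * alpha), (1j * alpha, 1 + 1j * alpha))

def free_gap(ka):
    """Transfer matrix (2.146) of a potential-free interval of length a."""
    return ((cmath.exp(1j * ka), 0.0), (0.0, cmath.exp(-1j * ka)))

def double_barrier_transparency(alpha, ka):
    """T = 1/|T11|^2, with the full matrix product (2.147) taken numerically."""
    total = matmul2(delta_barrier(alpha), matmul2(free_gap(ka), delta_barrier(alpha)))
    return 1.0 / abs(total[0][0]) ** 2

def closed_form_transparency(alpha, ka):
    """Eq. (2.149)."""
    t11 = alpha**2 * cmath.exp(-1j * ka) + (1 - 1j * alpha) ** 2 * cmath.exp(1j * ka)
    return 1.0 / abs(t11) ** 2

alpha = 3.0  # arbitrary illustrative value
for ka in (0.5, 1.5, 2.5):
    print(ka, double_barrier_transparency(alpha, ka), closed_form_transparency(alpha, ka))

# Resonance position derived for this sketch (see the lead-in above).
ka_res = 0.5 * math.pi + math.atan(alpha)
print(ka_res, double_barrier_transparency(alpha, ka_res))
```

At the resonance the printed transparency equals 1 within the rounding error, even though each barrier alone has $T = 1/(1 + \alpha^2) = 0.1$.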


Fig. 2.17. Resonant tunneling through a quantum well with delta-functional walls: (a) transparency as a function of $ka$, and (b) calculating the resonance’s FWHM at $\alpha \gg 1$.

However, the new result is different in that for $\alpha \gg 1$, the resonance peaks of transparency are very narrow, reaching their maxima at $ka \approx k_na \equiv \pi n$, with $n = 1, 2, \ldots$ The physics of this effect is immediately clear from the comparison of this result with our analysis of the simplest quantum well - see Fig. 1.7 and its discussion. At $k \approx k_n$, the incident wave, which undergoes multiple sequential reflections from the semi-transparent walls of the well, forms a nearly standing wave, which at $\alpha \gg 1$ virtually coincides with one of the eigenfunctions of the well with infinite walls, with the standing wave amplitude much larger than that of the incident wave. As a result, the transmitted wave amplitude is proportionately increased. This is the famous effect of resonant tunneling, 34 in mathematical description identical to the resonant transmission of light through an optical Fabry-Perot resonator formed by two parallel semi-transparent mirrors. 35

33 In the analytical form: $(AB)_{jj'} = \sum_{j''=1}^{N}A_{jj''}B_{j''j'}$, where $N$ is the matrix rank (in our current case, $N = 2$).

Probably, the most surprising feature of this system is the fact that its maximum transparency is perfect ($T_{max} = 1$) even at $\alpha \to \infty$, i.e. in the case of a very low transparency of each of the two component barriers. 36 Indeed, the denominator in Eq. (149) may be interpreted as the squared length of the difference between two vectors, one of length $\alpha^2$, and another of length $|(1 - i\alpha)^2| = 1 + \alpha^2$, with angle $\theta = 2ka + \text{const}$ between them. At the resonance, the vectors are aligned, and the difference is smallest (equal to 1) - see Fig. 17b, so that $T_{max} = 1$.

We can use the same vector diagram to calculate the so-called FWHM, the common acronym for the Full Width [of the resonance curve at] Half-Maximum, i.e. the difference $\Delta k = k_+ - k_-$ between such two points on the opposite slopes of the same resonance, at which $T = T_{max}/2$ - see arrows in Fig. 17a. Let the vectors in Fig. 17b be slightly misaligned, by an angle $\theta \sim 1/\alpha^2 \ll 1$, so that the length of the difference vector (of the order of $\alpha^2\theta \sim 1$) is still much smaller than the length of each vector. In order to double its length squared, and hence reduce $T$ by a factor of 2 in comparison with its maximum value 1, the arc, $\alpha^2\theta$, between the vectors should also become equal to $\pm 1$, i.e. $\alpha^2(2k_\pm a + \text{const}) = \pm 1$. Subtracting these two equations from each other, we finally get

$$\Delta k \equiv k_+ - k_- = \frac{1}{a\alpha^2} \ll k_\pm. \tag{2.150}$$

Now let us use the simple potential shown in Fig. 16 to discuss an issue of large conceptual importance. For that, consider what would happen if at some initial moment (say, $t = 0$) we have placed a 1D quantum particle inside the double-barrier well with $\alpha \gg 1$, and left it there alone, without any incident wave. To simplify the analysis, let us prepare the initial state so that it coincides with the ground state of the infinite-wall well - see Eq. (1.76):

$$\Psi(x, 0) = \psi_1(x) = \left(\frac{2}{a}\right)^{1/2}\sin[k_1(x - x_1)], \qquad \text{where } k_1 = \frac{\pi}{a}. \tag{2.151}$$

At $\alpha \to \infty$, this is an eigenstate of the system, and from our analysis in Sec. 1.5 we know its time evolution:

$$\Psi(x, t) = \psi_1(x)e^{-i\omega_1t}, \qquad \text{with } \omega_1 = \frac{E_1}{\hbar} = \frac{\hbar k_1^2}{2m} = \frac{\pi^2\hbar}{2ma^2}, \tag{2.152}$$

telling us that the particle remains in the well at all times with constant probability $W(t) = W(0) = 1$. 37

However, if parameter $\alpha$ is large but finite, the de Broglie wave should slowly “leak out” from the well, so that $W(t)$ would slowly decrease. Let us consider this effect approximately, assuming that the slow leakage, with a characteristic time $\tau \gg 1/\omega_1$, does not affect the instant wave distribution inside the well, besides the reduction of $W$. 38 Then we can generalize Eqs. (151), (152) as follows:

$$\Psi(x, t) = \left(\frac{2W}{a}\right)^{1/2}\sin[k_1(x - x_1)]e^{-i\omega_1t}, \tag{2.153}$$

making the probability of finding the particle in the well equal to $W$. This solution may be presented as a sum of two traveling waves:

$$\Psi(x, t) = Ae^{i(k_1x - \omega_1t)} + Be^{-i(k_1x + \omega_1t)}, \tag{2.154}$$

with equal magnitudes of their amplitudes and probability currents

$$|A| = |B| = \left(\frac{W}{2a}\right)^{1/2}, \qquad I_A = \frac{\hbar}{m}|A|^2k_1 = \frac{\hbar}{m}\frac{W}{2a}\frac{\pi}{a}, \qquad I_B = -I_A. \tag{2.155}$$

34 In older literature, it is sometimes called the Ramsauer (or “Townsend”, or “Ramsauer-Townsend”) effect. However, it is currently more common to use that name(s) only for a similar 3D effect, especially at scattering of low-energy electrons on rare gas atoms - this is how it was first observed, independently, by C. Ramsauer and J. Townsend in the early 1920s.

35 See also, e.g., EM Sec. 7.9.

36 The exact equality $T_{max} = 1$ is correct only if both component barriers are exactly equal.

37 Probability $W(t)$ should not be confused with the delta-functional barrier’s “area” $\mathcal{W}$, defined by Eq. (122).

But we already know from Eq. (128) that at $\alpha \gg 1$ the delta-functional wall transparency $T$ approximately equals $1/\alpha^2$, so that the wave carrying current $I_A$, incident on the right wall from inside, induces an outgoing wave outside of the well (Fig. 18) with the following probability current:

$$I_R = TI_A = \frac{1}{\alpha^2}I_A = \frac{1}{\alpha^2}\frac{\pi\hbar W}{2ma^2}. \tag{2.156a}$$

Absolutely similarly,

$$I_L = \frac{1}{\alpha^2}I_B = -I_R. \tag{2.156b}$$

Fig. 2.18. Metastable state’s decay in the simple model of a 1D potential well with low-transparent walls - schematically.

Now we may combine the 1D version (6) of the probability conservation law for the well’s interior,

$$\frac{dW}{dt} + I_R - I_L = 0, \tag{2.157}$$

with Eqs. (156) to write

$$\frac{dW}{dt} = -\frac{1}{\alpha^2}\frac{\pi\hbar}{ma^2}W. \tag{2.158}$$

This is just the standard differential equation,

Metastable state’s decay law

$$\frac{dW}{dt} = -\frac{1}{\tau}W, \tag{2.159}$$

of the exponential decay, with solution $W(t) = W(0)\exp\{-t/\tau\}$, where constant $\tau$, in our case equal to

Metastable state’s lifetime

$$\tau = \frac{ma^2}{\pi\hbar}\alpha^2, \tag{2.160}$$

is called the metastable state’s lifetime. Using expression (2.34) for the de Broglie waves’ group velocity, for our particular wave vector giving $v_{gr} = \hbar k_1/m = \pi\hbar/ma$, Eq. (160) may be rewritten as

$$\tau = \frac{t_A}{T}, \tag{2.161}$$

where in our case the attempt time $t_A$ is equal to $a/v_{gr}$, and $T = 1/\alpha^2$. Relation (161), which is valid for a large class of metastable systems, 39 may be interpreted in the following semi-classical way. The confined particle travels back and forth between the confining walls, with time intervals $t_A$ between the moments of incidence, each time making an attempt to leak through the wall, with a success probability of $T$, so the reduction of $W$ per each incidence is $\Delta W = -WT$, immediately leading to Eq. (161).

38 This almost evident assumption finds its formal justification in the perturbation theory to be discussed in Chapter 6.
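The semi-classical “attempt” picture may be turned into a few lines of Python. This is a rough sketch under stated assumptions (units with ħ = m = 1, hypothetical function names): it compares the lifetime (160) with the product of the attempt time $a/v_{gr}$ and $\alpha^2$, and then checks that repeated multiplication of $W$ by $(1 - T)$ at each wall incidence approximates the exponential decay law.

```python
import math

HBAR, M = 1.0, 1.0  # illustrative units (an assumption of this sketch)

def lifetime(a, alpha):
    """Metastable state's lifetime, Eq. (2.160)."""
    return M * a**2 * alpha**2 / (math.pi * HBAR)

def attempt_time(a):
    """t_A = a / v_gr, with the group velocity v_gr = pi*hbar/(m*a)."""
    v_gr = math.pi * HBAR / (M * a)
    return a / v_gr

def survival_by_attempts(a, alpha, t):
    """Semi-classical picture: W is multiplied by (1 - T) at each wall incidence."""
    transparency = 1.0 / alpha**2  # thin-wall transparency at alpha >> 1
    attempts = int(round(t / attempt_time(a)))
    return (1.0 - transparency) ** attempts

a, alpha = 1.0, 10.0  # arbitrary illustrative values
tau = lifetime(a, alpha)
print(tau, attempt_time(a) * alpha**2)                        # the two values should coincide
print(survival_by_attempts(a, alpha, tau), math.exp(-1.0))    # close for alpha >> 1
```

The small residual difference in the second printed line is the price of replacing the continuous decay with discrete attempts; it shrinks as $\alpha$ grows.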


Another important look at Eq. (160) may be taken by returning to the resonant tunneling problem and expressing the resonance width (150) in terms of the incident particle’s energy:

$$\Delta E = \Delta\left(\frac{\hbar^2k^2}{2m}\right) \approx \frac{\hbar^2k_1}{m}\Delta k = \frac{\pi\hbar^2}{ma^2}\frac{1}{\alpha^2}. \tag{2.162}$$

Comparing Eqs. (160) and (162), we get a remarkably simple formula

Energy-time uncertainty relation

$$\Delta E\cdot\tau = \hbar. \tag{2.163}$$

This so-called energy-time uncertainty relation is certainly more general than our simple model; for example, it is valid for the lifetime and resonance tunneling width of any metastable state. This seems very natural, since because of the energy identification with frequency, $E = \hbar\omega$, typical for quantum mechanics, Eq. (163) may be rewritten as $\Delta\omega\cdot\tau = 1$ and seems to follow directly from the Fourier transform in time, just as Heisenberg’s uncertainty relation (1.35) follows from the Fourier transform in space. In some cases, these two relations are indeed interchangeable; for example, Eq. (24) for the Gaussian wave packet width may be rewritten as $\delta E\cdot\Delta t = \hbar$, where $\delta E = \hbar(d\omega/dk)\delta k = \hbar v_{gr}\delta k$ is the r.m.s. spread of energies of monochromatic components of the packet, while $\Delta t \equiv \delta x/v_{gr}$ is the time scale of the packet passage through a fixed observation point $x$.

However, Eq. (163) is much less general than Heisenberg’s uncertainty relation (1.35). Indeed, in nonrelativistic quantum mechanics, Cartesian coordinates (say, $x$) of a particle, components of its momentum (say, $p_x$), and energy $E$ are regular observables, presented by operators. In contrast, time is treated as a c-number argument, and is not presented by an operator, so that Eq. (163) cannot be derived under such general assumptions as Eq. (1.35). Thus the time-energy uncertainty relation should be applied with great caution. Unfortunately, not everybody is so careful. One can find, for example, wrong claims that due to this relation, the energy dissipated by any system performing an elementary (single-bit) calculation during time interval $\Delta t$ has to be larger than $\hbar/\Delta t$. 40 Another incorrect statement is that the energy of a system cannot be measured, during time $\Delta t$, with an accuracy better than $\hbar/\Delta t$. 41

39 Essentially the only requirement is to have the attempt time $t_A$ be much longer than the effective time (instanton time, see Sec. 5.3 below) of tunneling through the barrier. In the delta-functional approximation for the barrier, the latter time vanishes, so that this requirement is always fulfilled.

Now let us use our simple model of the metastable state’s decay for a preliminary discussion of one aspect of quantum measurements. Figure 18 shows (schematically) one of the traveling wave packets emitted by the quantum well after its initial state (152) had been prepared at $t = 0$. (A similar packet is emitted to the left.) At $t \gg \tau$, the well becomes essentially empty ($W \ll 1$), and the whole probability distribution is localized in two clearly separated wave packets of equal amplitudes, moving away from the well with speed $v_{gr}$, each “carrying the particle away” with a probability of 50%. Now assume an experiment has detected the particle on the left side of the well. Though the formalisms suitable for a quantitative analysis of the detection process will not be discussed until Sec. 7.7, due to the wide separation of the packets, we may safely assume that the detection may be done without any actual physical effect on the counterpart wave packet. 42 But if we know that the particle has been found on the left, there is no chance to find it on the right.

If we attributed the wavefunction to all stages of this particular experiment, this situation might be rather confusing. Indeed, this would mean that the wavefunction within the right packet should instantly turn into zero - the so-called wave packet reduction - a process that could not be described by either the Schrödinger equation or any other law of physics. However, if (as was already discussed in Sec. 1.3) we attribute the wavefunction to a statistical ensemble of similar experiments, there is no paradox here at all. While the two-packet picture we have calculated (Fig. 18) describes the full initial ensemble (regardless of the particle detection results), the “reduced packet” picture (with no wave packet on the right of the well) describes only a sub-ensemble of experiments with the particle detected on the left side. As was discussed on completely classical examples in Sec. 1.3, for such a sub-ensemble the probability distribution, and hence the wavefunction, may be dramatically different.


2.6. Coupled quantum wells 

Let us now move on to tunneling through a more complex potential profile shown in Fig. 19: a sequence of $(N - 1)$ similar quantum wells separated by $N$ similar delta-functional tunnel barriers. According to Eq. (141), its transfer matrix is the following product

$$\mathrm{T} = \underbrace{\mathrm{T}_\alpha\mathrm{T}_a\mathrm{T}_\alpha\ldots\mathrm{T}_a\mathrm{T}_\alpha}_{(N-1)+N\ \text{terms}}, \tag{2.164}$$

with the component matrices given by Eqs. (143) and (146), and the barrier height parameter $\alpha$ defined by the last of Eqs. (127).


40 Here I dare to refer the reader to my own old work K. Likharev, Int. J. Theor. Phys. 21, 311 (1982) that presented a constructive proof that at reversible computation (introduced in 1973 by C. Bennett) the energy dissipation may be lower than this apparent “quantum limit”.

41 See, e.g., a detailed discussion of this issue in the monograph by V. Braginsky and F. Khalili, Quantum Measurement, Cambridge U. Press, 1992.

42 This argument is especially convincing if the particle detection time is much shorter than the time $t_c = 2v_{gr}t/c$, where $c$ is the speed of light in vacuum, i.e. the maximum velocity of any information transfer (“signaling”).

Fig. 2.19. Resonant tunneling through a system of $N$ similar, equidistant barriers, i.e. $(N - 1)$ similar quantum wells.


Remarkably, this multiplication may be carried out analytically, 43 giving

Transparency of N equidistant tunnel barriers

$$T \equiv |T_{11}|^{-2} = \left[(\cos Nqa)^2 + \left(\frac{\sin ka - \alpha\cos ka}{\sin qa}\sin Nqa\right)^2\right]^{-1}, \tag{2.165}$$

where $q$ is a new parameter, with the wave number dimensionality, defined by the following relation:

$$\cos qa \equiv \cos ka + \alpha\sin ka. \tag{2.166}$$

For $N = 1$, Eqs. (165) and (166) immediately yield our old result (128), while for $N = 2$ they may be reduced to Eq. (149) - see Fig. 17a. Figure 20 shows its predictions for two larger numbers $N$, and several values of parameter $\alpha$.
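Before the analytical proof is available, Eqs. (165)-(166) may at least be cross-checked against the direct matrix product (164). The Python sketch below is a hedged illustration with hypothetical function names; it relies on `cmath.acos` to handle those values of $ka$ at which $|\cos qa| > 1$, so that $q$ becomes complex while the right-hand side of Eq. (165) stays real.

```python
import cmath
import math

def matmul2(a, b):
    """'Row by column' product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][r] * b[r][j] for r in range(2)) for j in range(2))
        for i in range(2)
    )

def delta_barrier(alpha):
    """Transfer matrix (2.143) of a delta-functional barrier."""
    return ((1 - 1j * alpha, -1j * alpha), (1j * alpha, 1 + 1j * alpha))

def free_gap(ka):
    """Transfer matrix (2.146) of a potential-free interval of length a."""
    return ((cmath.exp(1j * ka), 0.0), (0.0, cmath.exp(-1j * ka)))

def product_transparency(n_barriers, alpha, ka):
    """T = 1/|T11|^2 from the direct product (2.164) of N barriers and N-1 gaps."""
    total = delta_barrier(alpha)
    for _ in range(n_barriers - 1):
        total = matmul2(total, matmul2(free_gap(ka), delta_barrier(alpha)))
    return 1.0 / abs(total[0][0]) ** 2

def closed_form_transparency(n_barriers, alpha, ka):
    """Eqs. (2.165)-(2.166); q may be complex, so complex functions are used."""
    qa = cmath.acos(math.cos(ka) + alpha * math.sin(ka))
    ratio = cmath.sin(n_barriers * qa) / cmath.sin(qa)
    denom = cmath.cos(n_barriers * qa) ** 2 + ((math.sin(ka) - alpha * math.cos(ka)) * ratio) ** 2
    return 1.0 / denom.real

alpha = 1.5  # arbitrary illustrative value
for n in (1, 2, 3, 10):
    for ka in (0.3, 1.1, 2.5):
        print(n, ka, product_transparency(n, alpha, ka), closed_form_transparency(n, alpha, ka))
```

The closed form needs a fixed amount of work for any $N$, while the matrix product grows linearly with $N$; the latter, however, works just as well for barriers that are not identical or equidistant.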


Fig. 2.20. Transparency of the system shown in Fig. 19 as a function of product $ka$. Since the function $T(ka)$ is $\pi$-periodic (just like for $N = 2$, see Fig. 17a), only one period is shown. (Panels: $N = 3$ and $N = 10$.)


Let us start the discussion of the plots from the case $N = 3$, i.e. two coupled quantum wells. A comparison of Fig. 20a and Fig. 17a shows that the transmission patterns, and their dependence on parameter $\alpha$, are very similar, except that in the coupled wells each resonant tunneling peak splits into two, with the $ka$-difference between them scaling as $1/\alpha$. In order to comprehend the physics of this important result, let us analyze an auxiliary system shown in Fig. 21: two similar quantum wells confined by infinitely high potential walls at $x = \pm a$, and coupled via a transparent, short tunnel barrier at $x = 0$.

43 This formula will be easier to prove after we have discussed properties of Pauli matrices in Chapter 4.



Fig. 2.21. Two lowest eigenfunctions and 
eigenenergies of a system of two coupled 
quantum wells - schematically. 


The barrier may be again, for calculation simplicity, approximated by a delta-function: 


$$U(x) = \begin{cases} +\infty, & \text{for } |x| > a, \\ \mathcal{W}\delta(x), & \text{for } |x| < a. \end{cases} \tag{2.167}$$


We already know that the standing-wave eigenfunctions $\psi_n$ of the Schrödinger equation in regions with $U(x) = 0$ (in our current case, the segments $-a < x < 0$ and $0 < x < +a$) may always be presented as linear superpositions of $\sin kx$ and $\cos kx$. In order to immediately satisfy the boundary conditions $\psi = 0$ at $x = \pm a$, we can take these solutions in the form

$$\psi_n(x) = \begin{cases} C_- \sin k(x + a), & \text{for } -a < x < 0, \\ C_+ \sin k(x - a), & \text{for } 0 < x < +a. \end{cases} \tag{2.168}$$


What remains is to satisfy the boundary conditions at x = 0. Plugging Eq. (167) into Eqs. (124) and 
(125), we get the following system of two linear equations: 


$$k(C_+ - C_-)\cos ka = \frac{2m\mathcal{W}}{\hbar^2}\,C_-\sin ka, \tag{2.169}$$

$$C_-\sin ka = -C_+\sin ka. \tag{2.170}$$

The system has two types of solutions, with the two lowest-energy eigenfunctions sketched in Fig. 21: 

(i) Antisymmetric solutions (which will be marked with index A), 

$$(C_+)_A = (C_-)_A, \quad \text{i.e. } \psi_A = C_A\sin k_A x, \tag{2.171}$$

with eigenvalues independent of $\mathcal{W}$,

$$\sin k_A a = 0, \quad \text{i.e. } k_A a = k_n a \equiv n\pi, \quad n = 1, 2, \ldots \tag{2.172}$$

Notice that these values of $k$, and hence the eigenenergies of these antisymmetric states,

$$E_A = \frac{\hbar^2 k_A^2}{2m} = \frac{\pi^2\hbar^2}{2ma^2}\,n^2, \tag{2.173}$$

coincide with those of the simple quantum well of width $a$ - see Fig. 1.7 and its discussion.

(ii) Symmetric solutions (index S ): 

$$(C_+)_S = -(C_-)_S, \quad \text{i.e. } \psi_S = C_S\sin k_S(|x| - a), \tag{2.174}$$

with Eq. (169) giving the following characteristic equation for the constant $k_S$:

[Characteristic equation for two coupled quantum wells]

$$\tan k_S a = -\frac{1}{\alpha}. \tag{2.175}$$

Figure 22 shows the graphic solution of this equation for three values of parameter $\alpha$, i.e. for various strengths of the quantum well coupling. For each solution, $k_S a$ is confined within the interval

$$n\pi - \frac{\pi}{2} < k_S a < n\pi, \tag{2.176}$$


so that the antisymmetric and symmetric states alternate on the scale of $k$ (and hence of the energy), with the difference $k_A - k_S$, for each pair of adjacent states, smaller than $\pi/2a$ for any value of $\alpha$. The physics of the splitting between the eigenenergies corresponding to the symmetric and antisymmetric states is very simple: it is the change of the kinetic energy of the particle due to different confinement types - see Fig. 21. In each antisymmetric mode, $\psi_n(0) = \psi_n(\pm a) = 0$, i.e. the wavefunction is essentially confined within a segment of length $a$; as a result, its energy (173) does not depend on the barrier height. On the contrary, in the symmetric mode, which does engage the potential barrier, the wavefunction effectively spreads into the counterpart well. As a result, it changes slower, and hence its kinetic energy is also lower than that of the adjacent antisymmetric mode.



Fig. 2.22. Graphical solution of the characteristic equation (175) for the eigenvalue of $ka$ in the symmetric mode (horizontal axis: $ka/\pi$), for 3 values of parameter $\alpha$ ($\alpha$ = 3, 1, and 0.3), considering it independent of $ka$. The dashed line shows approximation (178).


By the way, this problem may serve as a toy model of the strongest (and most important) type of atom cohesion - the covalent (or “chemical”) bonding in molecules, liquids, and solids. The classical example of such bonding is that of hydrogen atoms in a H$_2$ molecule. 44 Each of two electrons of this system 45 reduces its kinetic energy very substantially by spreading its wavefunction around both nuclei (protons), rather than being confined near one of them - as it had to be in a single atom. As a result, the bonding is very strong: in chemical units, 429 kJ/mol, i.e. 4.45 eV per molecule. 46 Somewhat counter-intuitively, this energy is substantially larger than that of the strongest classical (ionic) bonding due to electron transfer between atoms, leading to the Coulomb attraction of the resulting ions. (For example, the atomic cohesion in the NaCl molecule is just 3.28 eV.)

44 Historically, the development of the fully quantum theory of H$_2$ bonding by W. Heitler and F. London in 1927 was the breakthrough decisive for the acceptance of then-emerging quantum mechanics by chemists.

45 Due to the opposite spins, the Pauli principle allows them to be in the same orbital ground state - see Chapter 8.

In the limit $\alpha \to 0$ (no partition between the wells), $k_S a \to \pi(n - 1/2)$, i.e. the eigenstates approach the shape and energy of the symmetric states of a quantum well of width $2a$. In the opposite limit $\alpha \gg 1$, $k_S a \to n\pi$, and in the vicinity of each such point we may approximate $\tan k_S a$ with $(k_S a - n\pi)$ - see the dashed line in Fig. 22. As a result, the characteristic equation (175) is reduced to

$$k_S a \approx n\pi - \frac{1}{\alpha}, \tag{2.177}$$

so that the splitting between the wave numbers and eigenenergies of the adjacent symmetric and antisymmetric states is small:

$$k_A - k_S \approx \frac{1}{\alpha a} \ll k_n, \qquad 2\delta_n \equiv E_A - E_S \approx \left.\frac{dE}{dk}\right|_{k = k_n}(k_A - k_S) \approx \frac{\hbar^2 k_n}{m}\,\frac{1}{\alpha a} = \frac{2E_n}{n\pi\alpha}. \tag{2.178}$$

(By construction, this result is valid only if $\alpha \gg 1$, i.e. $\delta_n \ll E_A \approx E_S$.)
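The quality of approximation (177) is easy to check numerically. The sketch below treats $\alpha$ as a constant (as in Fig. 22) and uses the closed-form root $k_S a = n\pi - \arctan(1/\alpha)$ of Eq. (175), which is a simple rearrangement rather than a formula from the text; the function names are illustrative.

```python
import math

def ks_a(n, alpha):
    """Root of tan(k_S a) = -1/alpha inside (n*pi - pi/2, n*pi), alpha held fixed."""
    return n * math.pi - math.atan(1.0 / alpha)

def ks_a_weak_coupling(n, alpha):
    """Approximation (2.177), expected to work for alpha >> 1."""
    return n * math.pi - 1.0 / alpha

if __name__ == "__main__":
    for alpha in (0.3, 1.0, 3.0, 30.0):
        exact = ks_a(1, alpha)
        approx = ks_a_weak_coupling(1, alpha)
        # k_A a = n*pi, so (pi - exact) is the splitting (k_A - k_S)*a for n = 1
        print(f"alpha={alpha:5.1f}  k_S a={exact:.4f}  "
              f"Eq.(177)={approx:.4f}  splitting={math.pi - exact:.4f}")
```

The printed splitting stays below $\pi/2$ for every $\alpha$ and approaches $1/\alpha$ at weak coupling, as the text states.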

Let us analyze the properties of the system in this limit in much more detail - first, because the results will help us to develop the important tight binding approximation in the band theory, and second, because the weakly coupled quantum wells will be our first example of the very important two-level (or “spin-1/2-like”) systems. Let us focus on one couple of symmetric and antisymmetric states, corresponding to virtually the same $E_n$. According to Eqs. (171) and (174), in the limit $\alpha \to \infty$, the system’s eigenfunctions may be approximately represented as follows:

$$\psi_S(x) \approx \frac{1}{\sqrt{2}}\left[\psi_R(x) + \psi_L(x)\right], \qquad \psi_A(x) \approx \frac{1}{\sqrt{2}}\left[\psi_R(x) - \psi_L(x)\right], \tag{2.179}$$


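where $\psi_{R,L}$ are the normalized ground states of the completely insulated wells:

$$\psi_R(x) = \begin{cases} 0, & \text{for } -a < x < 0, \\ (2/a)^{1/2}\sin k_n x, & \text{for } 0 < x < +a, \end{cases} \qquad \psi_L(x) = \begin{cases} -(2/a)^{1/2}\sin k_n x, & \text{for } -a < x < 0, \\ 0, & \text{for } 0 < x < +a. \end{cases} \tag{2.180}$$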


Let us perform the following conceptually important thought experiment: place the particle, at $t = 0$, into one of the localized states, say $\psi_R(x)$, and leave the system alone to evolve. Solving Eqs. (179) for $\psi_R$, we may present the initial state as a linear superposition of eigenfunctions:

$$\Psi(x, 0) = \psi_R(x) \approx \frac{1}{\sqrt{2}}\left[\psi_S(x) + \psi_A(x)\right]. \tag{2.181}$$


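Now, according to the general solution (1.67) of the time-dependent Schrödinger equation, the time dynamics may be obtained by just multiplying each eigenfunction by the corresponding factor (1.61):

$$\Psi(x, t) \approx \frac{1}{\sqrt{2}}\left[\psi_S(x)\exp\left\{-i\frac{E_S}{\hbar}t\right\} + \psi_A(x)\exp\left\{-i\frac{E_A}{\hbar}t\right\}\right]. \tag{2.182}$$

Now, introducing the following natural notation,

$$E_n \equiv \frac{E_A + E_S}{2}, \qquad \delta_n \equiv \frac{E_A - E_S}{2}, \tag{2.183}$$

and using Eqs. (179), this expression may be rewritten as

$$\Psi(x, t) \approx \frac{1}{\sqrt{2}}\left[\psi_S(x)\exp\left\{i\frac{\delta_n}{\hbar}t\right\} + \psi_A(x)\exp\left\{-i\frac{\delta_n}{\hbar}t\right\}\right]\exp\left\{-i\frac{E_n}{\hbar}t\right\} = \left[\psi_R(x)\cos\frac{\delta_n}{\hbar}t + i\,\psi_L(x)\sin\frac{\delta_n}{\hbar}t\right]\exp\left\{-i\frac{E_n}{\hbar}t\right\}. \tag{2.184}$$

46 Unit reminder: 1 kJ/mol ≈ 0.0104 eV.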


This result implies, in particular, that the probabilities $W_R$ and $W_L$ to find the particle, correspondingly, in the right and left wells change with time as

[Quantum oscillations in two coupled wells]

$$W_R = \cos^2\frac{\delta_n}{\hbar}t, \qquad W_L = \sin^2\frac{\delta_n}{\hbar}t, \tag{2.185}$$

mercifully leaving the total probability constant: $W_R + W_L = 1$. (If our calculation had not passed this sanity check, we would be in big trouble.)


This is the famous effect of periodic quantum oscillations, with frequency $\omega_n = 2\delta_n/\hbar = (E_A - E_S)/\hbar$, of the particle between two similar quantum wells, due to their coupling via tunneling through the tunnel barrier. The physics of this effect is straightforward: just as in the single well problem discussed in Sec. 5, the particle initially placed into a certain quantum well tries to escape from it via tunneling through the semi-transparent wall. However, in our current situation (Fig. 21) the particle can only escape into the adjacent well. After the tunneling into that second well, the particle tries to escape from it, and hence comes back, etc. - just as a classical 1D oscillator, initially deflected from its equilibrium position.
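As a quick numerical illustration of these oscillations, here is a minimal sketch that evaluates the well occupation probabilities following from Eq. (184), i.e. $\cos^2(\delta_n t/\hbar)$ and $\sin^2(\delta_n t/\hbar)$. The units ($\hbar = 1$) and the value of $\delta_n$ are arbitrary assumptions made only for the example.

```python
import math

HBAR = 1.0  # assumed units with hbar = 1 (illustration only)

def well_probabilities(t, delta_n):
    """Return (W_R, W_L) for a particle placed into the right well at t = 0."""
    theta = delta_n * t / HBAR
    return math.cos(theta) ** 2, math.sin(theta) ** 2

def oscillation_period(delta_n):
    """Period 2*pi/omega_n with omega_n = 2*delta_n/hbar."""
    return math.pi * HBAR / delta_n

if __name__ == "__main__":
    delta_n = 0.05  # arbitrary coupling energy
    period = oscillation_period(delta_n)
    for fraction in (0.0, 0.25, 0.5, 1.0):
        w_r, w_l = well_probabilities(fraction * period, delta_n)
        print(f"t = {fraction:4.2f} periods: W_R = {w_r:.3f}, W_L = {w_l:.3f}")
```

After half a period the particle is found in the left well with certainty, and after a full period it is back in the right one.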

Maybe the most surprising feature of this effect is its relatively high frequency: according to Eq. (178), the time period of the quantum oscillations,

$$\Delta t_n \equiv \frac{2\pi}{\omega_n} = \frac{2\pi\hbar}{E_A - E_S} \approx \frac{2\pi m a}{\hbar k_n}\,\alpha, \quad \text{for } \alpha \gg 1, \tag{2.186}$$

is a factor of $\alpha/2\pi \gg 1$ shorter than the lifetime $\tau$ (160) of the metastable state of the particle in a similar but single quantum well limited by delta-functional walls with a similar parameter $\alpha$. This is a very counterintuitive result indeed: the speed of particle tunneling into a similar adjacent well is much higher than that, through a similar barrier, to the free space!

To see whether this result is an artifact of the delta-functional model of the tunnel barrier, let us calculate the splitting $2\delta_n$ for a system of two similar, symmetric, soft quantum wells formed by a smooth potential profile $U(x) = U(-x)$ - see Fig. 23.

Fig. 2.23. Weak coupling between two similar, soft quantum wells.

If the barrier transparency is low, the quasi-localized wavefunctions $\psi_R(x)$ and $\psi_L(x) = \psi_R(-x)$ and their eigenenergies may be found approximately by solving the Schrödinger equations in one of the wells, neglecting tunneling through the barrier, but finding $\delta_n$ requires a little bit more care. Let us write the stationary Schrödinger equations for the symmetric and antisymmetric solutions in the form

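$$\left[E_A - U(x)\right]\psi_A = -\frac{\hbar^2}{2m}\frac{d^2\psi_A}{dx^2}, \qquad \left[E_S - U(x)\right]\psi_S = -\frac{\hbar^2}{2m}\frac{d^2\psi_S}{dx^2}, \tag{2.187}$$

then multiply the former equation by $\psi_S$, the latter one by $\psi_A$, subtract them from each other, and integrate the result from 0 to $\infty$:

$$(E_A - E_S)\int_0^\infty \psi_S\psi_A\,dx = \frac{\hbar^2}{2m}\int_0^\infty\left[\frac{d^2\psi_S}{dx^2}\psi_A - \frac{d^2\psi_A}{dx^2}\psi_S\right]dx. \tag{2.188}$$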


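If $U(x)$, and hence $d^2\psi_{A,S}/dx^2$, are finite for all $x$, 47 we may integrate the right-hand side by parts to get

$$(E_A - E_S)\int_0^\infty \psi_S\psi_A\,dx = \frac{\hbar^2}{2m}\left[\frac{d\psi_S}{dx}\psi_A - \frac{d\psi_A}{dx}\psi_S\right]_0^\infty. \tag{2.189}$$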


So far, this is an exact equation. For weakly coupled wells, we can do more. In this case, the left-hand side may be approximated as $(E_A - E_S)/2 \equiv \delta_n$, because the integral is dominated by the vicinity of point $a$, where the second terms in each of Eqs. (179) are negligible, and the integral is equal to 1/2, due to the proper normalization of function $\psi_R(x)$. On the right-hand side, the substitution at $x = \infty$ vanishes (due to the wavefunction decay in the classically forbidden region), and so does the first term at $x = 0$, because for the antisymmetric solution $\psi_A(0) = 0$. As a result, we get

$$\delta_n = \frac{\hbar^2}{2m}\psi_S(0)\frac{d\psi_A}{dx}(0) = \frac{\hbar^2}{m}\psi_R(0)\frac{d\psi_R}{dx}(0) = \frac{\hbar^2}{m}\psi_L(0)\frac{d\psi_R}{dx}(0) = -\frac{\hbar^2}{m}\psi_R(0)\frac{d\psi_L}{dx}(0). \tag{2.190}$$
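The passage between the different forms of Eq. (190) may be spelled out as follows. This is only a sketch of the intermediate step, assuming that Eqs. (179) hold at $x = 0$ and that $\psi_L(x) = \psi_R(-x)$, so that $\psi_L(0) = \psi_R(0)$ and $d\psi_L/dx\,(0) = -d\psi_R/dx\,(0)$:

```latex
\psi_S(0) \approx \frac{1}{\sqrt{2}}\left[\psi_R(0) + \psi_L(0)\right] = \sqrt{2}\,\psi_R(0),
\qquad
\frac{d\psi_A}{dx}(0) \approx \frac{1}{\sqrt{2}}\left[\frac{d\psi_R}{dx}(0) - \frac{d\psi_L}{dx}(0)\right]
  = \sqrt{2}\,\frac{d\psi_R}{dx}(0),
\qquad\text{so that}\qquad
\frac{\hbar^2}{2m}\,\psi_S(0)\,\frac{d\psi_A}{dx}(0)
  \approx \frac{\hbar^2}{m}\,\psi_R(0)\,\frac{d\psi_R}{dx}(0).
```

The remaining forms follow by replacing $\psi_R(0)$ with $\psi_L(0)$, or $d\psi_R/dx\,(0)$ with $-d\psi_L/dx\,(0)$.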


It is straightforward to show that within the limits of the WKB approximation validity, Eq. (190) may be reduced to

[WKB result for coupling energy]

$$\delta_n = \frac{\hbar}{t_a}\exp\left\{-\int_{x_c}^{x_c'}\kappa(x')\,dx'\right\}, \tag{2.191}$$

where $t_a$ is the time period of classical motion of the particle inside one of the wells, function $\kappa(x)$ is defined by Eq. (97), and $x_c$ and $x_c'$ are the classical turning points limiting the potential barrier at the level $E_n$ of the particle’s energy - see Fig. 23. Comparing this result with Eq. (117), we can notice that again, just as in the case of the delta-functional barriers, the transmission coefficient $\mathcal{T}$ of a tunnel barrier (and hence the reciprocal lifetime of a metastable state in a potential well separated by such a barrier from a continuum) scales as the square of the WKB exponent participating in Eq. (191), so that the period of quantum oscillations between the wells is much smaller than the lifetime. We will return to the discussion of this result, in a more general form, in Chapter 5.

47 Since it is not true for potential (167), one should not be surprised that the resulting Eq. (189) is invalid for our initial problem, giving $\delta_n$ twice larger than the correct expression (178).

Returning for a second to Fig. 20a, we may now readily interpret the results for tunneling through the double quantum well: each pair of resonance peaks of transparency corresponds to the alignment of the incident particle’s energy with the pair of energy levels $E_S$, $E_A$ of the symmetric and antisymmetric states of the system.


2.7. 1D band theory

Let us now return to Eqs. (165) and (166) describing the resonant tunneling, and discuss their predictions for larger $N$ - see, for example, Fig. 20b. We see that the increase of $N$ results in the increase of the number of resonant peaks per period to $(N - 1)$, and at $N \to \infty$ the peaks merge into the so-called allowed energy bands (frequently called just the “energy bands”) of relatively high transparency, separated from similar bands in the adjacent periods of function $\mathcal{T}(ka)$ by energy gaps 48 where $\mathcal{T} \to 0$. Notice the following important features of the pattern:

(i) at $N \to \infty$, the band/gap edges become sharp for any $\alpha$, and tend to fixed positions (determined by $\alpha$ but independent of $N$);

(ii) the larger the interwell coupling ($\alpha \to 0$), the broader the allowed energy bands and the narrower the gaps between them.

Our discussion of resonant tunneling in the previous section gives us an evident clue for a semi-quantitative interpretation of this pattern: if $(N - 1)$ quantum wells are weakly coupled by tunneling through the tunnel barriers separating them, the system’s energy spectrum consists of groups of $(N - 1)$ energy levels. Each level corresponds to an eigenfunction that is the set of similar local functions in each well, but with certain phase shifts $\Delta\varphi$ between them. It is natural to expect that, just as for 2 coupled wells ($N - 1 = 2$), at the upper level $\Delta\varphi = \pi$ (thus providing the highest confinement), with $ka \to n\pi$ at $\alpha \to \infty$, while at the lowest level all $\Delta\varphi = 0$, providing the most loose confinement. 49 However, what about $\Delta\varphi$ for other levels?

Answers to all these questions are easy to get in the most important limit $N \to \infty$, i.e. for periodic structures - which are, in particular, good 1D approximations for solid state crystals, whose samples may feature more than $10^{10}$ similar atoms or molecules in each direction of the crystal lattice. It is almost self-evident that at $N \to \infty$, due to the translational invariance of $U(x)$,

$$U(x + a) = U(x), \tag{2.192}$$

the phase shift $\Delta\varphi$ between local wavefunctions in all adjacent quantum wells should be the same for each period of the system, i.e.

$$\psi(x + a) = \psi(x)e^{i\Delta\varphi} \tag{2.193a}$$

for all $x$. (A reasonably fair classical image of $\Delta\varphi$ is the geometric angle between similar objects - e.g., similar paper clips - attached at equal distances to a long, uniform rubber band. If the band’s ends are twisted, the twist is equally distributed between the structure’s periods, representing the constancy of $\Delta\varphi$. 50 )

48 In solid state (especially semiconductor) physics and electronics, the term bandgaps is more common.

49 This expectation is implicitly confirmed by Fig. 20: at $\alpha \gg 1$, the highest resonance peak in each group tends to $ka = n\pi$, and the lowest peak also tends to a position independent of $N$ (though dependent on $\alpha$).


Equation (193a) is the (1D version of the) much-celebrated Bloch theorem. 51 Mathematical rigor aside, 52 it is a virtually evident fact, because the particle’s density $w(x) = \psi^*(x)\psi(x)$, which has to be periodic in this $a$-periodic system, may be so only if $\Delta\varphi$ is constant. For what follows, it is more convenient to present the real number $\Delta\varphi$ in the form $qa$ (there is no loss of generality here, because parameter $q$ may depend on $a$ as well as other parameters of the system), so that the Bloch theorem takes the form

$$\psi(x + a) = \psi(x)e^{iqa}. \tag{2.193b}$$

The physical sense of parameter $q$ will be discussed in detail below; for now just note that according to Eq. (193b), an addition of $(2\pi/a)$ to it yields the same wavefunction; hence all observables have to be $(2\pi/a)$-periodic functions of $q$. 53


Now let us use the Bloch theorem to find eigenfunctions and eigenenergies for a particular, and probably the simplest periodic function $U(x)$: an infinite set of similar quantum wells separated by delta-functional tunnel barriers (Fig. 24).

Fig. 2.24. The simplest periodic potential: an infinite set of similar, equidistant, delta-functional tunnel barriers, separated by distance $a$.


50 I am ashamed to confess that, due to the lack of time, this was virtually the only “lecture demonstration” in my 
QM courses. 

51 Named after F. Bloch who applied this concept to wave mechanics in 1929, i.e. very soon after its formulation. 
Admittedly, in mathematics, an equivalent statement, usually called the Floquet theorem, has been known since at 
least 1883. 

52 I will address this rigor in two steps. Later in this section, we will see that the function obeying Eq. (193) is indeed a solution of the Schrödinger equation. However, to save time/space, it will be better for us to postpone the proof that any eigenfunction of the equation, with periodic boundary conditions, obeys the Bloch theorem, until Chapter 4. As a partial reward for the delay, that proof will be valid for an arbitrary spatial dimensionality.

53 Product $\hbar q$, which has the dimensionality of momentum, is called either the quasi-momentum or (especially in solid state physics) the “crystal momentum” of the particle.

To start, consider two points separated by distance $a$: one of them, $x_j$, just left of the position of one of the barriers, and another one, $x_{j+1}$, just left of the following barrier. Eigenfunctions in each of the points may be presented as linear superpositions of two simple waves $\exp\{\pm ikx\}$, and the amplitudes of their components should be related by a 2×2 transfer matrix T of the potential fragment separating them. According to Eq. (141), this matrix may be found as the product of the matrix (146) of one interval $a$ and the matrix (143) of one barrier:

$$\begin{pmatrix} A_{j+1} \\ B_{j+1} \end{pmatrix} = \mathrm{T}_a\mathrm{T}_\alpha\begin{pmatrix} A_j \\ B_j \end{pmatrix}. \tag{2.194}$$

However, according to the Bloch theorem (193b), the component amplitudes should be also related as

$$\begin{pmatrix} A_{j+1} \\ B_{j+1} \end{pmatrix} = e^{iqa}\begin{pmatrix} A_j \\ B_j \end{pmatrix}. \tag{2.195}$$

The condition of self-consistency of these two equations leads to the following characteristic equation:

$$\left|\mathrm{T}_a\mathrm{T}_\alpha - e^{iqa}\,\mathrm{I}\right| = 0. \tag{2.196}$$


In Sec. 5, we have already calculated the matrix product participating in this equation - see Eq. (148). Using it, we see that Eq. (196) is reduced to the same simple Eq. (166) that has already jumped at us from the solution of the different (resonant tunneling) problem. Let us explore that simple result in detail. First of all, the right-hand part of Eq. (166) is a sinusoidal function of $ka$, with amplitude $(1 + \alpha^2)^{1/2}$ - see Fig. 25, while its left-hand part is a sinusoidal function of $qa$ with amplitude 1.

Fig. 2.25. Graphical solution of the characteristic equation (166) for a fixed value of parameter $\alpha$. The ranges of $ka$ that yield $|\cos qa| < 1$ correspond to the allowed energy bands, while those with $|\cos qa| > 1$ correspond to the gaps between them.


As a result, within each period $\Delta(ka) = 2\pi$, the characteristic equation does not have a real solution for $q$ inside two intervals of $ka$ - and hence inside two intervals of energy $E = \hbar^2k^2/2m$. (These intervals are exactly the energy gaps mentioned above, while the complementary intervals of $ka$ and $E$, where a real $q$ exists, are the allowed energy bands.) In contrast, parameter $q$ can take any real values, so it is more convenient to plot the eigenenergy $E = \hbar^2k^2/2m$ as the function of $q$ (or, even more conveniently, $qa$) rather than $ka$. 54 While doing that, we need to recall that parameter $\alpha$, defined by the last of Eqs. (127), depends on wave vector $k$ as well, so that if we vary $q$ (and hence $k$), it is better to characterize the structure by a different, $k$-independent dimensionless parameter, for example

$$\beta \equiv (ka)\alpha \equiv \frac{ma\mathcal{W}}{\hbar^2}, \tag{2.197}$$

so that Eq. (166) becomes

[Characteristic equation for system in Fig. 24]

$$\cos qa = \cos ka + \beta\,\frac{\sin ka}{ka}. \tag{2.198}$$


Figure 26 shows the plots of $k$ and $E$, following from Eq. (198), for a particular, moderate value of parameter $\beta$. The band structure of the energy spectrum is apparent. Another evident feature is the $2\pi$-periodicity of the pattern in $qa$, which we have already predicted from the general Bloch theorem arguments. (Due to this periodicity, the complete band/gap pattern may be studied on just one interval $-\pi < qa < +\pi$, called the 1st Brillouin zone - the so-called reduced zone picture. For some applications, however, it is more convenient to use the extended zone picture with $-\infty < qa < +\infty$ - see, e.g., the next section.)
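The band edges implied by Eq. (198) may be located numerically. The following is a minimal sketch under the assumption $\beta > 0$; the helper names `rhs` and `lower_band_edge` are illustrative. It relies on the facts that at $ka = n\pi$ the right-hand side of Eq. (198) equals $(-1)^n$ exactly (the top of the $n$-th band), and that the bottom of that band is the single root of the equation $\text{rhs} = (-1)^{n-1}$ inside the interval $((n-1)\pi, n\pi)$.

```python
import math

def rhs(ka, beta):
    """Right-hand side of Eq. (2.198); a real q exists only where |rhs| <= 1."""
    return math.cos(ka) + beta * math.sin(ka) / ka

def lower_band_edge(n, beta, tol=1e-12):
    """Bisection for the bottom of the n-th band in ((n-1)*pi, n*pi), for beta > 0."""
    target = (-1.0) ** (n - 1)                 # value of cos(qa) at the band bottom
    sign = 1.0 if target > 0 else -1.0
    lo = (n - 1) * math.pi + 1e-9              # just inside the gap (avoids ka = 0)
    hi = n * math.pi                           # top of the band, rhs = (-1)**n
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sign * (rhs(mid, beta) - target) > 0:
            lo = mid                           # still in the gap
        else:
            hi = mid                           # already in the allowed band
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    beta = 3.0                                 # the value used in Fig. 26
    for n in (1, 2, 3):
        bottom, top = lower_band_edge(n, beta), n * math.pi
        # energies in units of E_0 = hbar^2/(2 m a^2) are just (ka)^2
        print(f"band {n}: ka from {bottom:.4f} to {top:.4f}, "
              f"E/E_0 from {bottom**2:.3f} to {top**2:.3f}")
```

The gaps are the complementary intervals of $ka$, from $n\pi$ up to the bottom of band $n + 1$.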


Fig. 2.26. (a) “Real” momentum $k$ of a particle in the periodic delta-functional potential profile shown in Fig. 24, and (b) its energy $E = \hbar^2k^2/2m$ (in units of $E_0 \equiv \hbar^2/2ma^2$), as functions of the quasi-momentum $q$ (horizontal axis: $qa/2\pi$, with the 1st Brillouin zone marked), for a particular value ($\beta = 3$) of the dimensionless potential parameter $\beta \equiv (ka)\alpha \equiv m\mathcal{W}a/\hbar^2$. Arrows in the lower right corner of panel (b) illustrate the definition of the energy band ($\Delta E_n$) and energy gap ($\Delta_n$) widths.


54 Perhaps a more important reason for taking $q$ as the argument is that for motion in a general potential $U(x)$, the particle’s momentum $\hbar k$ is not a constant of motion, while (according to the Bloch theorem), the quasi-momentum $\hbar q$ is.

However, maybe the most surprising fact, clearly visible in Fig. 26, is that there is an infinite number of energy bands, with different energies $E_n(q)$ for the same value of $q$. Mathematically, it is evident from Eq. (198) - see also Fig. 25. Indeed, for each value of $qa$ there are two solutions $ka$ to this equation on each period $\Delta(ka) = 2\pi$ - see also panel (a) in Fig. 26. Each of such solutions gives a different value of particle energy $E = \hbar^2k^2/2m$. A continuous set of similar solutions for various $qa$ forms a particular energy band.

Since the band theory is one of the most vital results of quantum mechanics, it is important to 
understand the physics of these different solutions - and hence of the whole band picture. For that, let us 
explore analytically two different potential strength limits. An important advantage of this approach is 
that both analyses may be carried out for an arbitrary periodic potential U(x), rather than for the simplest 
model shown in Fig. 24. 

(i) Tight-binding approximation. This approximation is sound when the eigenenergy $E_n$ is much lower than the height of the potential barriers separating the potential minima (serving as quantum wells) - see Fig. 27. As should be clear from our discussion in Sec. 6, the wavefunction is mostly localized in the classically allowed regions at points $x_j$ of the potential energy minima - see the dashed lines in Fig. 27. Essentially the only role of coupling between these quantum well states (via tunneling through the separating barriers) is to establish certain phase shifts $\Delta\varphi \equiv qa$ between the pairs of adjacent quasi-localized wavefunction “lumps” $u(x - x_j)$ and $u(x - x_{j+1})$.

Fig. 2.27. Tight binding approximation (schematically).


To describe this effect quantitatively, let us first return to the problem of two coupled wells considered in Sec. 6, and recast result (184) as

$$\Psi_n(x, t) = \left[a_R(t)\psi_R(x) + a_L(t)\psi_L(x)\right]\exp\left\{-i\frac{E_n}{\hbar}t\right\}, \tag{2.199}$$

where functions $a_R$ and $a_L$ oscillate sinusoidally in time:

$$a_R(t) = \cos\frac{\delta_n}{\hbar}t, \qquad a_L(t) = i\sin\frac{\delta_n}{\hbar}t. \tag{2.200}$$

This evolution satisfies the following system of two equations, whose structure resembles that of Eq. (1.59):

$$i\hbar\dot{a}_R = -\delta_n a_L, \qquad i\hbar\dot{a}_L = -\delta_n a_R. \tag{2.201}$$
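One way to see that the sinusoidal functions (200) indeed solve the system (201) is to integrate the latter numerically. The sketch below does that with a hand-written fourth-order Runge-Kutta step; the units ($\hbar = 1$), the step count and the parameter values are assumptions made for illustration only.

```python
import math

HBAR = 1.0  # assumed units with hbar = 1 (illustration only)

def derivatives(a_r, a_l, delta_n):
    """Right-hand sides of Eqs. (2.201), solved for the time derivatives."""
    k = 1j * delta_n / HBAR        # i*hbar*da_R/dt = -delta_n*a_L  =>  da_R/dt = k*a_L
    return k * a_l, k * a_r

def evolve(t_end, delta_n, steps=2000):
    """RK4 integration from the initial state a_R = 1, a_L = 0."""
    a_r, a_l = 1.0 + 0j, 0j
    h = t_end / steps
    for _ in range(steps):
        k1 = derivatives(a_r, a_l, delta_n)
        k2 = derivatives(a_r + 0.5 * h * k1[0], a_l + 0.5 * h * k1[1], delta_n)
        k3 = derivatives(a_r + 0.5 * h * k2[0], a_l + 0.5 * h * k2[1], delta_n)
        k4 = derivatives(a_r + h * k3[0], a_l + h * k3[1], delta_n)
        a_r += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        a_l += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return a_r, a_l

if __name__ == "__main__":
    t, delta_n = 3.0, 0.7
    a_r, a_l = evolve(t, delta_n)
    print("numerical:", a_r, a_l)
    print("Eq. (200):", math.cos(delta_n * t / HBAR), 1j * math.sin(delta_n * t / HBAR))
```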


Later in the course (in Chapter 6) we will prove that such equations are indeed valid, in the tight-binding approximation, for any system of two coupled quantum wells. These equations may be readily generalized to the case of many similar coupled wells. In this case, instead of Eq. (199), we evidently should write

$$\Psi_n(x, t) = \exp\left\{-i\frac{E_n}{\hbar}t\right\}\sum_j a_j(t)\,u_n(x - x_j), \tag{2.202}$$

where $E_n$ are the eigenenergies, and $u_n$ the eigenfunctions of each isolated well. In the tight binding limit, only the adjacent wells are coupled, so that instead of Eq. (201) we should write an infinite system of similar equations

$$i\hbar\dot{a}_j = -\delta_n a_{j-1} - \delta_n a_{j+1}, \tag{2.203}$$

for each well number $j$, where parameters $\delta_n$ describe the coupling between two adjacent quantum wells. Repeating the calculation outlined in the end of Sec. 6 for our new situation, we get the result essentially similar to the last form of Eq. (190):



[Tight binding limit: coupling energy] (2.204)

where $x_0$ is the distance between the well bottom and the middle of the tunnel barrier on the right of it - see Fig. 27. The only substantial new feature of this expression in comparison with Eq. (190) is that the sign of $\delta_n$ alternates with the level number $n$: $\delta_1 > 0$, $\delta_2 < 0$, $\delta_3 > 0$, etc. Indeed, the number of “wiggles” (formally, zeros) of eigenfunctions $u_n(x)$ of any potential well increases with $n$ - see, e.g., Fig. 1.7, 55 so that the difference of the exponential tails of the functions, sneaking under the left and right barriers limiting the well, also alternates with $n$.


The infinite system of ordinary differential equations (203) allows one to explore a large range of important problems (such as the spread of the wavefunction that was initially localized in one well, etc.), but our main task now is to find its stationary states, i.e. the solutions proportional to $\exp\{-i(\varepsilon_n/\hbar)t\}$, where $\varepsilon_n$ is a still unknown, $q$-dependent addition to the background energy $E_n$ of the $n$-th level. In order to satisfy the Bloch theorem (193) as well, such a solution should have the form

$$a_j(t) = a\exp\left\{iqx_j - i\frac{\varepsilon_n}{\hbar}t + \text{const}\right\}, \tag{2.205}$$

where $a$ is a constant. Plugging this solution into Eq. (203) and canceling the common exponent, we get

[Tight binding limit: energy bands]

$$E = E_n + \varepsilon_n = E_n - \delta_n\left(e^{-iqa} + e^{iqa}\right) = E_n - 2\delta_n\cos qa, \tag{2.206}$$

so that in this approximation, the energy band width $\Delta E_n$ (see Fig. 26b) equals $4|\delta_n|$.


Relation (206), whose validity is restricted to $|\delta_n| \ll E_n$, describes the particular lowest energy bands plotted in Fig. 26b reasonably well. (For larger $\beta$ the agreement would be even better.) So, this calculation explains what the energy bands really are - in the tight binding limit they are best interpreted as the isolated well’s energy levels $E_n$, broadened into bands by the interwell interaction. Also, this result gives a clear proof that the energy band extremes correspond to $qa = 2\pi l$ and $qa = 2\pi(l + 1/2)$, with integer $l$. Finally, the sign alternation of the coupling coefficient $\delta_n$ (204) with number $n$ explains why the energy maxima of one band are aligned, on the $qa$ axis, with the energy minima of the adjacent bands.
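The substitution leading to Eq. (206) can be verified numerically: the sketch below checks that the Bloch-type amplitudes $a_j \propto \exp\{iqx_j\}$, with $x_j = ja$, satisfy the stationary form of Eq. (203) when $\varepsilon_n = -2\delta_n\cos qa$. The function names and sample values are illustrative assumptions.

```python
import cmath
import math

def band_energy(qa, e_n, delta_n):
    """Eq. (2.206): E = E_n - 2*delta_n*cos(qa)."""
    return e_n - 2.0 * delta_n * math.cos(qa)

def residual(qa, delta_n, j=5):
    """|eps*a_j + delta_n*(a_{j-1} + a_{j+1})| for a_j = exp(i*qa*j).

    The common time factor exp(-i*eps*t/hbar) cancels and is omitted.
    """
    amplitude = lambda m: cmath.exp(1j * qa * m)
    eps = -2.0 * delta_n * math.cos(qa)
    return abs(eps * amplitude(j) + delta_n * (amplitude(j - 1) + amplitude(j + 1)))

if __name__ == "__main__":
    e_n, delta_n = 1.0, 0.1                    # arbitrary illustrative values
    print("residual:", residual(0.7, delta_n))
    width = band_energy(math.pi, e_n, delta_n) - band_energy(0.0, e_n, delta_n)
    print("band width:", width, "vs 4*|delta_n| =", 4 * abs(delta_n))
```

With $\delta_n > 0$ the band minimum is at $qa = 0$ and the maximum at $qa = \pi$; for $\delta_n < 0$ the two swap, which is the alignment of maxima and minima of adjacent bands mentioned above.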


55 Below, we will see several other examples of this behavior. This alternation rule is also in accordance with the Bohr-Sommerfeld quantization condition

(ii) Weak-potential limit. Surprisingly, the energy band structure is also compatible with a completely different physical picture that can be developed in the opposite limit. Let energy $E$ be so high that the periodic potential $U(x)$ may be treated as a small perturbation. Naively, we would have the parabolic dispersion relation between the particle’s energy and momentum. However, if we are plotting energy as a function of $q$ rather than $k$, we need to add $2\pi l/a$, with arbitrary integer $l$, to the argument. Let us show this by expanding all variables into the spatial Fourier series. For a periodic potential energy $U(x)$ such an expansion is straightforward: 56

$$U(x) = \sum_{l''}U_{l''}\exp\left\{-i\frac{2\pi x}{a}l''\right\}, \tag{2.207}$$

where the summation is over all integers $l''$, from $-\infty$ to $+\infty$. However, for the wavefunction we should show due respect to the Bloch theorem (193). To understand how to proceed, let us define another function

$$u(x) \equiv \psi(x)e^{-iqx}, \tag{2.208}$$

and study its periodicity:

$$u(x + a) = \psi(x + a)e^{-iq(x + a)} = \psi(x)e^{-iqx} = u(x). \tag{2.209}$$

We see that the new function is $a$-periodic, and hence we can use Eqs. (208)-(209) to rewrite the Bloch theorem as

[Bloch theorem]

$$\psi(x) = u(x)e^{iqx}, \quad \text{with } u(x + a) = u(x). \tag{2.210}$$

Now it is safe to expand the periodic function $u(x)$ exactly as $U(x)$:

$$u(x) = \sum_{l'}u_{l'}\exp\left\{-i\frac{2\pi x}{a}l'\right\}, \tag{2.211}$$

so that, according to the Bloch theorem,

$$\psi(x) = e^{iqx}\sum_{l'}u_{l'}\exp\left\{-i\frac{2\pi x}{a}l'\right\} = \sum_{l'}u_{l'}\exp\left\{i\left(q - \frac{2\pi}{a}l'\right)x\right\}. \tag{2.212}$$
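A direct numerical check shows that any function of the form (212) obeys the Bloch theorem (193b), whatever the Fourier amplitudes are. In the sketch below the amplitudes, the period and the value of $q$ are made-up numbers, and the function name `bloch_psi` is illustrative.

```python
import cmath
import math

def bloch_psi(x, q, a, coeffs):
    """Eq. (2.212); coeffs maps an integer l' to the Fourier amplitude u_{l'}."""
    return sum(u * cmath.exp(1j * (q - 2.0 * math.pi * l / a) * x)
               for l, u in coeffs.items())

if __name__ == "__main__":
    coeffs = {0: 1.0, 1: 0.3 - 0.2j, -2: 0.1j}   # arbitrary made-up amplitudes
    q, a, x = 1.1, 2.0, 0.37
    lhs = bloch_psi(x + a, q, a, coeffs)
    rhs = cmath.exp(1j * q * a) * bloch_psi(x, q, a, coeffs)
    print(abs(lhs - rhs))                         # zero up to rounding errors
```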


The only nontrivial part of plugging this expression into the stationary Schrödinger equation (61) is the calculation of the product term, using expansions (207) and (211):

$$U(x)\psi = \sum_{l', l''}U_{l''}u_{l'}\exp\left\{i\left[q - \frac{2\pi}{a}(l' + l'')\right]x\right\}. \tag{2.213}$$

At fixed $l'$, we may change the summation over $l''$ to that over $l \equiv l' + l''$ (so that $l'' \equiv l - l'$), and write:

$$U(x)\psi = \sum_l\exp\left\{i\left(q - \frac{2\pi}{a}l\right)x\right\}\sum_{l'}u_{l'}U_{l - l'}. \tag{2.214}$$

56 The benefits of my unusual choice of the summation index ($l''$ instead of, say, $l$) will be clear in a few lines.

Now plugging Eqs. (212) (with index $l'$ now replaced by $l$) and (214) into the stationary Schrödinger equation (61), and requiring the coefficients of each spatial exponent to match, we get an infinite system of linear equations for $u_l$: 57

$$\sum_{l'}U_{l - l'}u_{l'} = \left[E - \frac{\hbar^2}{2m}\left(q - \frac{2\pi}{a}l\right)^2\right]u_l. \tag{2.215}$$


So far, this system is an equivalent alternative to the initial Schrödinger equation - and, by the way, is very efficient for fast numerical calculations, for virtually any potential strength, though in systems with tight binding it may require taking into account a large number of harmonics $u_l$. In the weak potential limit, i.e. if all the Fourier coefficients $U_n$ are small, 58 we can complete all the calculation analytically. 59 Indeed, in the so-called 0th approximation we can ignore all $U_n$, so that in order to have at least one $u_l$ different from 0, Eq. (215) requires that

$$E \to E_l \equiv \frac{\hbar^2}{2m}\left(q - \frac{2\pi l}{a}\right)^2 \tag{2.216}$$

($u_l$ itself should be obtained from the normalization condition). This result means that the dispersion relation $E(q)$ has an infinite number of similar quadratic branches numbered by integer $l$ - see Fig. 28.
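The numerical use of the system (215) mentioned above may be sketched as follows: truncate it to a finite number of harmonics and diagonalize the resulting matrix. The example is an illustration under stated assumptions, not the author's code: the potential is taken as $U(x) = 2U_1\cos(2\pi x/a)$ (so that only $U_{\pm 1} = U_1$ differ from zero), energies are measured in units of $E_0 = \hbar^2/2ma^2$, and the truncation `l_max` is arbitrary. NumPy's `numpy.linalg.eigvalsh` is used for the diagonalization.

```python
import numpy as np

def central_equation_levels(qa, u1, l_max=5):
    """Eigenvalues of the truncated system (2.215) for U(x) = 2*u1*cos(2*pi*x/a).

    Energies are in units of E_0 = hbar^2/(2*m*a^2); harmonics l = -l_max..l_max.
    """
    ls = np.arange(-l_max, l_max + 1)
    h = np.diag((qa - 2.0 * np.pi * ls) ** 2)      # free branches (2.216) on the diagonal
    for i in range(len(ls) - 1):                   # U_{l-l'} = u1 for |l - l'| = 1
        h[i, i + 1] = h[i + 1, i] = u1
    return np.linalg.eigvalsh(h)                   # sorted in ascending order

if __name__ == "__main__":
    print(central_equation_levels(0.3, 0.0)[:3])   # u1 = 0: the branches E_l themselves
    levels = central_equation_levels(np.pi, 0.1)   # zone edge, weak potential
    print("splitting of the two lowest levels:", levels[1] - levels[0])
```

With $U_1 = 0$ the eigenvalues reproduce the branches (216). With a small $U_1$, the two branches that cross at $qa = \pi$ split by approximately $2|U_1|$, as the diagonalization of the corresponding 2×2 block of the matrix suggests.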



[Figure: plot with horizontal axis $qa/2\pi$.]

Fig. 2.28. 1D band picture in the
weak potential case ($\Delta_n \ll E^{(1)}$).
Shading shows the 1st Brillouin zone.


On any branch, the eigenfunction has just one Fourier coefficient, i.e. presents a monochromatic 
traveling wave 


$$\psi_l \to u_l e^{ikx} = u_l \exp\left\{i\left(q - \frac{2\pi l}{a}\right)x\right\}. \tag{2.217}$$


57 Note that we have essentially proved that the Bloch wavefunction (210) is indeed a solution of Eq. (61),
provided that the quasi-momentum $q$ is selected in a way to make the system of linear equations (215) compatible,
i.e. is a solution of its characteristic equation - see, e.g., Eq. (223) below.

58 Besides the constant potential $U_0$ that, as we know from Sec. 2, may be included into energy in a trivial way, so
that we may take $U_0 = 0$.

59 This method is so powerful that its multi-dimensional version is not much more complex than the 1D version
described here - see, e.g., Sec. 3.2 in the classical textbook by J. M. Ziman, Principles of the Theory of Solids, 2nd
ed., Cambridge U. Press, 1979.


Essential Graduate Physics 


QM: Quantum Mechanics 


Weak potential limit: energy gap positions

Weak potential limit: energies near bandgap

This fact allows us to rewrite Eq. (215) in a more transparent form 

$$\sum_{l' \ne l} U_{l-l'} u_{l'} = (E - E_l) u_l, \tag{2.218}$$

that may be formally solved for $u_l$:

$$u_l = \frac{1}{E - E_l} \sum_{l' \ne l} U_{l-l'} u_{l'}. \tag{2.219}$$


If the Fourier coefficients $U_n$ are nonvanishing but small, this formula shows that wavefunctions do
acquire other Fourier components (besides the main one, with the index corresponding to the branch
number), but these additions are all small, besides narrow regions near the points $E_l = E_{l'}$ where two
branches (216) of the dispersion relation $E(q)$, with some specific numbers $l$ and $l'$, cross. This happens
when


$$\left(q - \frac{2\pi}{a}l\right)^2 \approx \left(q - \frac{2\pi}{a}l'\right)^2, \tag{2.220}$$


i.e. at $q \approx q_m \equiv \pi m/a$ (with integer $m \equiv l + l'$)60 corresponding to


$$E_l \approx E_{l'} \approx \frac{\hbar^2}{2ma^2}\left[\pi(l + l') - 2\pi l\right]^2 = \frac{\pi^2\hbar^2}{2ma^2}n^2 \equiv E^{(n)}, \tag{2.221}$$


with integer $n \equiv l - l'$. (Equation (221) shows that index $n$ is just the number of the branch crossing on
the energy scale - see Fig. 28.) In such a region, $E$ has to be close to both $E_l$ and $E_{l'}$, so that the
denominator in just one of the infinite number of terms in Eq. (219) is very small, making the term
substantial despite the smallness of $U_n$. Hence we can take into account only one term in each of the
sums (written for $l$ and $l'$):


$$U_n u_{l'} = (E - E_l) u_l, \qquad U_{-n} u_l = (E - E_{l'}) u_{l'}. \tag{2.222}$$


Taking into account that for any real function $U(x)$ the Fourier coefficients in series (207) have to be
related as $U_{-n} = U_n^*$, Eq. (222) yields the following simple characteristic equation


$$\begin{vmatrix} E - E_l & -U_n \\ -U_n^* & E - E_{l'} \end{vmatrix} = 0, \tag{2.223}$$


with solution

$$E_\pm = E_{ave} \pm \left[\left(\frac{E_l - E_{l'}}{2}\right)^2 + U_n U_n^*\right]^{1/2}. \tag{2.224}$$


According to Eq. (216), close to the branch crossing point $q_m = \pi(l + l')/a$, the fraction
participating in this result may be approximated as61


60 Let me hope that the difference between this new integer and the particle's mass, both called $m$, is absolutely clear
from the context.

61 Physically, $|\gamma|/\hbar = \hbar(\pi|n|/a)/m = \hbar k^{(n)}/m$ is just the velocity of a free classical particle with energy $E^{(n)}$.
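$$\frac{E_l - E_{l'}}{2} \approx \gamma\tilde{q}, \quad \text{with } \gamma \equiv \frac{1}{2}\frac{d(E_l - E_{l'})}{dq}\Big|_{q=q_m}, \quad |\gamma| = \frac{\pi|n|\hbar^2}{ma} = \frac{2aE^{(n)}}{\pi|n|}, \quad \text{and } \tilde{q} \equiv q - q_m, \tag{2.225}$$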


while parameters $E_{ave} \equiv (E_l + E_{l'})/2 = E^{(n)}$ and $U_nU_n^* = |U_n|^2$ do not depend on $\tilde{q}$, i.e. the distance from
the central point $q_m$. This is why Eq. (224) may be plotted as the famous level anticrossing (also called
"avoided crossing", or "intended crossing", or "non-crossing") diagram (Fig. 29), with the energy gap
width $\Delta_n$ equal to $2|U_n|$, i.e. just double the magnitude of the $n$-th Fourier harmonic of the periodic
potential $U(x)$. Such anticrossings are also clearly visible in Fig. 28 that shows the results of the exact
solution of Eq. (198) for $\beta = 0.5$.62
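The anticrossing formula (224) is easy to probe numerically. The minimal sketch below evaluates the two roots of the characteristic equation (223) for given unperturbed energies and a (possibly complex) Fourier coefficient; the function name and the sample numbers are illustrative assumptions.

```python
import math

def anticrossing_levels(e_l, e_lp, u_n):
    """Two roots of the 2x2 characteristic equation (223), i.e. Eq. (224)."""
    e_ave = 0.5 * (e_l + e_lp)
    half_diff = 0.5 * (e_l - e_lp)
    root = math.sqrt(half_diff ** 2 + abs(u_n) ** 2)
    return e_ave - root, e_ave + root

u_n = 0.05 + 0.02j                       # sample Fourier harmonic, |U_n| << E
lower, upper = anticrossing_levels(1.0, 1.0, u_n)       # exactly at the crossing
gap = upper - lower                                     # equals 2|U_n|
far_lower, far_upper = anticrossing_levels(0.0, 2.0, u_n)  # far from the crossing
print(gap, far_lower, far_upper)
```

At the crossing point the gap is exactly $2|U_n|$, while far from it the two roots approach the unperturbed branches, as the diagram in Fig. 29 suggests.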



We will run into the anticrossing diagram again and again in the course, notably at the discussion
of spin. Such a diagram characterizes any quantum system with two weakly interacting eigenstates with
close energies. It is also repeatedly met in classical mechanics, for example at the calculation of
eigenfrequencies of coupled oscillators.63,64 In our current case of the weak potential limit, the diagram
describes the weak interaction of two sinusoidal de Broglie waves (216), with oppositely directed wave
vectors, $l$ and $-l'$, via the $(l - l')$th (i.e. $n$th) Fourier harmonic of the potential profile $U(x)$. This effect
exists also for the classical wave theory, and is known as the Bragg reflection, describing, for example,
the 1D case of the wave reflection by a crystal lattice (Fig. 1.5) in the limit of weak interaction between
the incident particles and the lattice.

Returning for the last time to our initial result - the band structure for the delta-functional $U(x)$
(Fig. 24), shown in Fig. 26, we may wonder how general it is, taking into account the peculiar properties
of the delta-function approximation. A partial answer may be obtained from the band structure for two
more realistic and relatively simple periodic functions $U(x)$: the sinusoidal potential (Fig. 30a) and the
rectangular Kronig-Penney potential shown in Fig. 30b.

For the sinusoidal potential (Fig. 30a), with $U(x) = U_1\cos(2\pi x/a)$, the stationary Schrödinger
equation (61) takes the form


62 From that figure, it is also clear that in the weak potential limit, width $\Delta E_n$ of the $n$-th energy band is just $E^{(n)} - E^{(n-1)}$ - see Eq. (221). Note that this is exactly the distance between adjacent energy levels of the simplest 1D
quantum well of infinite depth - cf. Eq. (1.77).

63 See, e.g., CM Sec. 5.1 and in particular Fig. 5.2. 

64 Actually, we could obtain this diagram earlier in this section, for the system of two weakly coupled quantum 
wells (Fig. 23), if we assumed the wells to be slightly dissimilar. 
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U_1\cos\frac{2\pi x}{a}\,\psi = E\psi. \tag{2.226}$$

By the introduction of dimensionless variables

$$\xi \equiv \frac{\pi x}{a}, \quad \alpha \equiv \frac{E}{E^{(1)}}, \quad 2\beta \equiv \frac{U_1}{E^{(1)}}, \tag{2.227}$$

where $E^{(1)}$ is defined by Eq. (221), Eq. (226) may be reduced to the canonical form of the well-known
Mathieu equation65


$$\frac{d^2\psi}{d\xi^2} + (\alpha - 2\beta\cos 2\xi)\psi = 0. \tag{2.228}$$



Fig. 2.30. Two simple periodic potential 
profiles: (a) the sinusoidal (“Mathieu”) 
potential and (b) the Kronig-Penney 
potential. 


Figure 31 shows the so-called characteristic curves of the Mathieu equation, i.e. the relations
between parameters $\alpha$ and $\beta$ corresponding to the energy band edges separating them from the adjacent
bands. (Such curves may be readily calculated numerically, for example, using Eqs. (215) with the band-edge
values $qa = 0$ and $qa = \pi$.) In such "phase plane" plots, the detailed information about the energy
dependence on the quasi-momentum is lost, but we already know from Fig. 26 that the dependence is
not too eventful. The most remarkable feature of these plots is the fast (exponential) disappearance of
the allowed energy bands at $2\beta > \alpha$ (in Fig. 31, above the red dashed line), i.e. at $E < U_1$. This may be
readily explained by our tight-binding approximation result (206): as soon as the eigenenergy drops
significantly below the potential maximum $U_{max} = U_1$ (see Fig. 30a), quantum states in the adjacent
potential wells are only connected by tunneling through the separating potential barriers, with
exponentially small amplitudes $\delta_n$ - see Eq. (204).

On the other hand, the characteristic curves below the dashed line, i.e. at $2\beta < \alpha$, correspond to
virtually free motion of the particle with energy $E$ above $U_{max} = U_1$. Naturally, in this region the energy
bands rapidly expand while gaps virtually disappear. This could be expected from the weak potential
limit analysis (see Fig. 28 and its discussion); however, based on that analysis one could expect that the


65 This equation, first studied in the 1860s by E. Mathieu in the context of a rather practical problem of vibrating 
elliptical drumheads (!), has many other important applications in physics and engineering, notably including the 
parametric excitation of oscillations - see, e.g., CM Sec. 4.5. 
energy gaps $\Delta_n \approx 2|U_n|$ would disappear more gradually. The fast decline of the gaps at $U_1 \to 0$ (i.e. $\beta \to 0$)
in the Mathieu equation is an artifact of the sinusoidal potential $U(x)$, with no Fourier harmonics
$U_n$ above the first one. (In order to calculate the correct asymptotic behavior $\Delta_n \propto \beta^n$ at $\beta \to 0$, one
needs to go beyond the first approximation we have used in the weak potential limit analysis.)


[Figure: plot with an axis labeled $\alpha/4$.]

Fig. 2.31. Characteristic curves of the
Mathieu equation. In application to the band
theory, dotted regions correspond to the
energy gaps, while regions between them, to
energy bands. The red dashed line
corresponds to condition $\alpha = 2\beta$, i.e. $E = U_1 = U_{max}$,
separating the regions of tunneling
and over-barrier motion. Figure adapted from
http://www.enm.bris.ac.uk/teaching/ .


If one wants to study the details of transition between the two limits in the 1D band theory
without the artifacts of the delta-functional model shown in Fig. 24 (with infinite number of harmonics
$U_n$ independent of $n$) and of the Mathieu equation (with all $U_n = 0$ for $n \ne \pm 1$), the standard way is to
examine the Kronig-Penney potential shown in Fig. 30b. For this potential, the characteristic equation
may be readily derived using our rectangular barrier analysis in Sec. 3. For the case $E < U_0$, the result is
the following natural generalization of Eq. (166):


$$\cos qa = \cosh\kappa d\,\cos k(a - d) + \frac{1}{2}\left(\frac{\kappa}{k} - \frac{k}{\kappa}\right)\sinh\kappa d\,\sin k(a - d), \tag{2.229}$$


where parameters $k$ and $\kappa$ are defined, as functions of $E$ and $U_0$, by Eqs. (62) and (65). In the opposite
case $E > U_0$, one can use the same formula with the replacement (73). Plots $E(q)$, described by these
formulas,66 are very similar to those shown in Figs. 26b and 28 above. In order to see some difference,
one needs to plot the characteristic curves $U_0(E)$. This may be done by taking $qa = 0$ and $qa = \pi$ (i.e.
$\cos qa = \pm 1$) in Eq. (229), and solving the resulting transcendental equation for $U_0$ numerically. The curves
are generally similar to those shown in Fig. 31, but, in accordance with Eq. (224), exhibit a more
gradual decrease of energy gaps:

$$\Delta_n \to 2|U_n| \propto \frac{1}{n}, \quad \text{at } E \sim E^{(n)} \gg U_0. \tag{2.230}$$
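The band/gap test implied by the characteristic equation (229) may be sketched in a few lines: an energy is allowed if the right-hand side of that equation has a magnitude not exceeding 1, so that a real $q$ exists. The sketch assumes the usual definitions $k = (2mE)^{1/2}/\hbar$ and $\kappa = [2m(U_0 - E)]^{1/2}/\hbar$ for Eqs. (62) and (65), and uses units with $\hbar = m = 1$ by default; all names are illustrative.

```python
import math

def kronig_penney_rhs(energy, u0, a, d, hbar=1.0, m=1.0):
    """Right-hand side of Eq. (229); valid only for 0 < energy < u0.

    a is the period and d the barrier width, so the well width is a - d.
    """
    k = math.sqrt(2.0 * m * energy) / hbar             # assumed form of Eq. (62)
    kappa = math.sqrt(2.0 * m * (u0 - energy)) / hbar  # assumed form of Eq. (65)
    w = a - d
    return (math.cosh(kappa * d) * math.cos(k * w)
            + 0.5 * (kappa / k - k / kappa) * math.sinh(kappa * d) * math.sin(k * w))

def is_allowed(energy, u0, a, d):
    """True if a real quasi-momentum q satisfies cos(qa) = rhs."""
    return abs(kronig_penney_rhs(energy, u0, a, d)) <= 1.0

# Vanishing barrier width: the free-particle relation cos(qa) = cos(ka) is recovered.
free_limit = kronig_penney_rhs(1.0, 5.0, 2.0, 0.0)
# Thick, high barrier at a low energy: deep inside an energy gap.
deep_in_gap = is_allowed(1.0, 50.0, 2.0, 1.0)
print(free_limit, deep_in_gap)
```

Scanning `is_allowed` over a grid of energies for fixed $U_0$ gives the band edges, i.e. one vertical cut of the characteristic curves $U_0(E)$ discussed above.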

To conclude this section, let me address the effect of periodic potential on the number of
eigenstates in 1D systems of large but finite length $l \gg a, k^{-1}$. Surprisingly, the Bloch theorem makes
the analysis of this problem elementary, for arbitrary $U(x)$. Indeed, let us assume that $l$ is comprised of


66 Such plots, for several particular values of parameters, may be found, for example, in Figs. 8.11-8.13 of E. 
Merzbacher’s textbook cited above. 
an integer number of periods $a$, and its ends are described by similar boundary conditions - both
assumptions evidently inconsequential for $l \gg a$ (such as a 1-cm-scale crystal with ~$10^8$ atoms along
each direction). Then, according to Eq. (210), the boundary conditions impose, on the quasi-momentum
$q$, exactly the same quantization condition as we had for $k$ for a free 1D motion. Hence, instead of Eq.
(1.94) we can write


$$dN = \frac{l}{2\pi}\,dq, \tag{2.231}$$

with the corresponding change of the summation rule

$$\sum_q f(q) \to \frac{l}{2\pi}\int f(q)\,dq. \tag{2.232}$$


Hence, the density of states in 1D $q$-space, $dN/dq = l/2\pi$, does not depend on the potential profile
at all! Note, however, that the profile does affect the density of states on the energy axis, $dN/dE$. As an
extreme example, on the bottom and at the top of each energy band we have $dE/dq \to 0$, and hence


$$\frac{dN}{dE} = \frac{dN}{dq}\bigg/\left|\frac{dE}{dq}\right| = \frac{l}{2\pi}\bigg/\left|\frac{dE}{dq}\right| \to \infty. \tag{2.233}$$


This divergence (which survives in higher spatial dimensionalities as well) of the state density has 
important implications for the operation of several electron and optical devices, in particular 
semiconductor lasers. 
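The counting argument above may be checked with a short sketch. It assumes periodic (Born-von Karman) boundary conditions over the length $l = Na$, which is one simple realization of the "similar boundary conditions" mentioned in the text, and lists the allowed quasi-momenta $q_j = 2\pi j/l$ inside the 1st Brillouin zone; the function name is illustrative.

```python
import math

def allowed_quasimomenta(num_periods, a=1.0):
    """Quasi-momenta with -pi/a < q <= pi/a allowed by periodic boundary
    conditions over the length l = num_periods * a (an assumed choice)."""
    length = num_periods * a
    return [2.0 * math.pi * j / length
            for j in range(-num_periods, num_periods + 1)
            if -num_periods < 2 * j <= num_periods]

q_values = allowed_quasimomenta(10)
states_per_band = len(q_values)          # one state per lattice period
spacing = q_values[1] - q_values[0]      # 2*pi/l, i.e. dN/dq = l/(2*pi)
print(states_per_band, spacing)
```

Under this assumption each energy band holds exactly as many orbital states as there are periods in the system, regardless of the shape of $U(x)$.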


2.8. Effective mass and the Bloch oscillations 

The band structure of the energy spectrum has profound implications not only on the density of
states, but also on the dynamics of particles in periodic potentials. In order to see that, let us consider the
simplest case: motion of a wave packet consisting of Bloch functions (210), all in the same (say, $n$-th)
energy band. Similarly to Eq. (27) for a free particle, we can describe such a packet as

$$\Psi(x,t) = \int a_q u_q(x) e^{i[qx - \omega(q)t]}\,dq, \tag{2.234}$$


where the $a$-periodic functions $u(x)$, defined by Eq. (208), are now indexed to emphasize their
dependence on the quasi-momentum, and $\omega(q) \equiv E_n(q)/\hbar$ is the function of $q$ describing the shape of the
corresponding energy band - see, e.g., Fig. 26b or Fig. 28. If the packet is narrow, i.e. the width $\delta q$ of
the distribution $a_q$ is much smaller than all the characteristic scales of the dispersion relation $\omega(q)$, in
particular $\pi/a$, we may simplify Eq. (234) exactly as we have done in Sec. 2 for a free particle, despite
the presence of factors $u_q(x)$ under the integral. In the linear approximation of the Taylor expansion, we
again get Eq. (32), but now with67


$$v_{gr} = \frac{d\omega}{dq}\Big|_{q=q_0}, \quad \text{and} \quad v_{ph} = \frac{\omega}{q}\Big|_{q=q_0}, \tag{2.235}$$


67 A generalization of this expression to the case of essential interband transitions is not difficult using the
Heisenberg picture of quantum mechanics (which will be discussed in Chapter 4 of this course) - see, e.g., Sec. 55
in E. M. Lifshitz and L. P. Pitaevskii, Statistical Physics, Part 2, Pergamon, 1980.

where $q_0$ is the central point of the quasi-momentum distribution. Despite the formal similarity with Eq.
(33) for the free particle, this result is much more eventful; for example, as evident from the dispersion
relation's topology (see Figs. 26b, 28), the group velocity vanishes not only at $q = 0$, but at all values of
$q$ that are multiples of ($\pi/a$), at the bottom and on the top of each energy band. At these points, the packet's
envelope does not move in either direction - though it may keep spreading.68

Even more fascinating phenomena take place if a particle in the periodic potential is the subject
of an additional external force $F(t)$. (For electrons in a crystal lattice, this may be, for example, the
Lorentz force of the applied electric and/or magnetic field.) Let the force be relatively weak, so that
product $Fa$ (i.e. the scale of energy increment from the additional force per one lattice period) is much
smaller than the relevant energy scales of the dispersion relation $E(q)$ - see Fig. 26b:

$$Fa \ll \Delta E_n, \Delta_n. \tag{2.236}$$


This relation allows one to neglect the force-induced interband transitions, so that the wave packet (234)
includes the Bloch eigenfunctions belonging to only one (initial) energy band at all times. For the time
evolution of its center $q_0$, theory yields69 an extremely simple equation of motion


$$\frac{dq_0}{dt} = \frac{F(t)}{\hbar}. \tag{2.237}$$


This equation is physically very transparent: it is essentially the 2nd Newton law for the time evolution
of the quasi-momentum $\hbar q$ under the effect of the additional force $F(t)$ only, excluding the periodic
force $-dU(x)/dx$ of the background potential $U(x)$. This is very natural, because $\hbar q$ is essentially the
particle's momentum averaged over the potential's period, and the periodic force effect drops out at such an
averaging.


Time evolution of quasi-momentum


Despite the simplicity of Eq. (237), the results of its solution may be highly nontrivial. First, let
us use Eqs. (235) and (237) to find the instant group acceleration of the particle (i.e. the acceleration of its
wave packet's envelope):

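$$a_{gr} \equiv \frac{dv_{gr}}{dt} = \frac{d}{dt}\frac{d\omega(q_0)}{dq_0} = \frac{d}{dq_0}\frac{d\omega(q_0)}{dq_0}\frac{dq_0}{dt} = \frac{d^2\omega(q_0)}{dq_0^2}\frac{dq_0}{dt} = \frac{1}{\hbar}\frac{d^2\omega}{dq^2}\Big|_{q=q_0} F(t). \tag{2.238}$$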


This means that the second derivative of the dispersion relation plays the role of the effective reciprocal 
mass of the particle: 


$$m_{ef} = \frac{\hbar}{d^2\omega/dq^2} = \frac{\hbar^2}{d^2E/dq^2}. \tag{2.239}$$

Effective mass


For the particular case of a free particle, described by Eq. (216), this expression is reduced to the
original (and constant) mass $m$, but generally the effective mass depends on the wave packet's
momentum. According to Eq. (239), at the bottom of any energy band, $m_{ef}$ is always positive, but
depends on the strength of the particle's interaction with the periodic potential. In particular, according to
Eq. (206), in the tight binding limit, the effective mass is very large:


68 For a Gaussian packet, the spreading is described by Eq. (39), with the replacement $k \to q$; it is curious that at
the inflection points with $d^2\omega/dq^2 = 0$ (which are present in each energy band) the packet does not spread.

69 The proof of Eq. (237) is not difficult, but becomes more compact in the bra-ket formalism, to be discussed in
Chapters 4 and 5. This is why I recommend the proof to the reader as an exercise after reading those two chapters.
$$\left|m_{ef}\right|_{q=(\pi/a)n} = \frac{\hbar^2}{2\delta_n a^2} = m\frac{E^{(1)}}{\pi^2\delta_n} \gg m. \tag{2.240}$$
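A quick finite-difference check of Eqs. (235), (239) and (240) is sketched below. It assumes that the tight-binding result (206) has the standard cosine form $E(q) = E_n - 2\delta_n\cos qa$, and uses units with $\hbar = a = 1$; the function names and the value of $\delta_n$ are illustrative assumptions.

```python
import math

HBAR = 1.0
PERIOD = 1.0
DELTA = 0.1   # tunneling amplitude delta_n (illustrative value)

def band_energy(q):
    """Assumed tight-binding dispersion E(q) = -2*delta*cos(q*a), with E_n = 0."""
    return -2.0 * DELTA * math.cos(q * PERIOD)

def group_velocity(q, h=1e-5):
    """Eq. (235): v_gr = (1/hbar) dE/dq, by a central difference."""
    return (band_energy(q + h) - band_energy(q - h)) / (2.0 * h * HBAR)

def effective_mass(q, h=1e-3):
    """Eq. (239): m_ef = hbar^2 / (d^2E/dq^2), by a central difference."""
    d2 = (band_energy(q + h) - 2.0 * band_energy(q) + band_energy(q - h)) / h ** 2
    return HBAR ** 2 / d2

m_bottom = effective_mass(0.0)                  # bottom of the band
m_top = effective_mass(math.pi / PERIOD)        # top of the band
m_expected = HBAR ** 2 / (2.0 * DELTA * PERIOD ** 2)   # Eq. (240)
v_mid = group_velocity(0.5 * math.pi / PERIOD)  # largest |v_gr|, at the inflection point
print(m_bottom, m_top, m_expected, v_mid)
```

The two band-edge masses have the same magnitude as Eq. (240) predicts but opposite signs, which is the point taken up later in the discussion of holes.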


On the contrary, in the weak potential limit, the effective mass is close to $m$ at most points of each
energy band, but at the edges of the (narrow) bandgaps it is much smaller. Indeed, expanding Eq. (224)
in the Taylor series near point $q = q_m$, we get




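$$E_\pm\big|_{q \approx q_m} - E^{(n)} \approx \pm|U_n| \pm \frac{1}{2|U_n|}\left(\frac{dE_l}{dq}\right)^2_{q=q_m}\tilde{q}^2 = \pm|U_n| \pm \frac{\gamma^2\tilde{q}^2}{2|U_n|}, \tag{2.241}$$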


where $\gamma$ and $\tilde{q}$ are defined by Eq. (225), so that


$$\left|m_{ef}\right|_{q=q_m} = |U_n|\frac{\hbar^2}{\gamma^2} = m\frac{|U_n|}{2E^{(n)}} \ll m. \tag{2.242}$$
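The estimate (242) may be compared with a direct numerical differentiation of the two-branch result (224). The sketch below uses units with $\hbar = m = 1$ and $a = \pi$ (so that $E^{(1)} = 0.5$), takes the branches $l = 1$ and $l' = 0$ crossing at $q_m = \pi/a$, and keeps the free-particle curvature of $E_{ave}$ that Eq. (241) neglects; the names and the value of $|U_n|$ are illustrative assumptions.

```python
import math

HBAR = 1.0
MASS = 1.0
PERIOD = math.pi       # chosen so that E^(1) = pi^2 hbar^2 / (2 m a^2) = 0.5
U_N = 0.01             # |U_n|, small compared with E^(1)

def free_branch(q, l):
    """Free-particle branch E_l(q) of Eq. (216)."""
    return HBAR ** 2 / (2.0 * MASS) * (q - 2.0 * math.pi * l / PERIOD) ** 2

def two_level_energy(q, sign):
    """Eq. (224) for the branches l = 1 and l' = 0 (so n = 1, q_m = pi/a)."""
    e_l, e_lp = free_branch(q, 1), free_branch(q, 0)
    root = math.sqrt(0.25 * (e_l - e_lp) ** 2 + U_N ** 2)
    return 0.5 * (e_l + e_lp) + sign * root

def effective_mass(q, sign, h=1e-4):
    """Eq. (239) evaluated by a central second difference."""
    d2 = (two_level_energy(q + h, sign) - 2.0 * two_level_energy(q, sign)
          + two_level_energy(q - h, sign)) / h ** 2
    return HBAR ** 2 / d2

q_m = math.pi / PERIOD
m_upper = effective_mass(q_m, +1)     # bottom of the upper band
m_lower = effective_mass(q_m, -1)     # top of the lower band
estimate = MASS * U_N / (2.0 * 0.5)   # Eq. (242) with E^(n) = E^(1) = 0.5
print(m_upper, m_lower, estimate)
```

The two numerical masses bracket the estimate within about one percent here; the small asymmetry between them is the effect of the neglected free-particle term.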


The effective mass effects in real solids may be very significant. For example, the charge carriers
in the ubiquitous field-effect transistors of silicon integrated circuits have $m_{ef} \approx 0.19\,m_e$ in the lowest
normally-empty energy band (traditionally called the conduction band), and $m_{ef} \approx 0.98\,m_e$ in the lower,
normally-filled valence band. In some semiconducting compounds the conduction-band electron mass
may be even smaller - down to $0.0145\,m_e$ in InSb!

However, the absolute value of the effective mass is not the most surprising effect. The more
shocking corollary of Eq. (239) is that on the top of each energy band the effective mass is negative -
please revisit Figs. 26, 28, and 29. This means that the particle (or more strictly its wave packet's
envelope) is accelerated in the direction opposite to the force. This is exactly what electronic engineers,
working with electrons in semiconductors, call holes, characterizing them by positive mass and positive
charge. If the particle does not leave a close vicinity of the energy band's top (say, due to scattering
effects), such a flip of signs does not lead to an error, because the Lorentz force is proportional to
the electron's charge ($q = -e$), so that the particle's acceleration $a_{gr}$ is proportional to the ratio ($q/m_{ef}$).70

However, for some phenomena the usual image of a hole as a particle with $q > 0$ and $m_{ef} > 0$ is
unacceptable. For example, let us form a narrow wave packet at the bottom of the lowest energy band,71
and then exert on it a constant force $F > 0$ - say, due to a constant external electric field directed along
axis $x$. According to Eq. (237), this would lead to a linear growth of $q_0$ in time, so that in the
quasi-momentum space, the packet's center would slide, with constant speed, along the $q$ axis - see Fig. 32a.
Close to the energy band bottom, this motion would correspond to a positive effective mass (possibly,
somewhat larger than the genuine particle's mass $m$), and hence be close to the free particle's acceleration.
However, as soon as $q_0$ has reached the inflection point, where $d^2E_1/dq^2 = 0$, the effective mass, and
hence acceleration (238), change signs to negative, i.e. the packet starts to slow down (in the direct space


70 The language in which the hole has a positive charge and mass has an additional convenience for states on the
top of the valence band whose single-particle states are normally filled. Then the simplest, single-particle
excitation of this multi-particle ground state may be created by giving one electron enough energy to lift it to a
reference (e.g., Fermi-energy) level $E_F$ that, by definition of the valence band, is higher than all values $E_-(q)$.
Then it is natural to prescribe to the excitation a positive mass $m_{ef}$, because the energy $\Delta E = E_F - E_-(q)$ necessary
for the excitation grows with the deviation of $q$ from $q_m$.

71 Intuition tells us (and statistical physics duly confirms :-) that this may be readily done, for example, by weakly
coupling the system to a low-temperature environment, and letting it relax to the lowest possible energy.
$x$) while still moving ahead in the quasi-momentum space. Finally, at the energy band's top the particle
stops at a certain $x_{max}$, while continuing to move in the $q$-space.




Fig. 2.32. The Bloch oscillations (red lines) and the Landau-Zener tunneling (blue arrows) within: 
(a) the time-domain picture, and (b) the energy-domain picture. On panel (b), the tilted gray strips 
show the allowed energy bands, and the bold red lines, the Wannier-Stark ladder. 


Now we have two alternative ways to look at the further time evolution of the wave packet.
From the extended zone picture (which is the simplest for this analysis, see Fig. 32a),72 we may say that
the particle crosses the 1st Brillouin zone boundary and starts going forward in $q$, i.e. down the lowest
energy band. According to Eq. (235), this region (up to the next inflection point) corresponds to a
negative group velocity. After $q_0$ has reached the next minimum of the energy band at $qa = 2\pi$, the
whole process repeats again (and again, and again).

These are the famous Bloch oscillations - the effect that was predicted (by the same F. Bloch) as 
early as in 1929, but evaded experimental observation until the 1980s - see below. Their time period 
may be readily found from Eq. (237): 

$$\Delta t_B = \frac{\Delta q}{dq/dt} = \frac{2\pi/a}{F/\hbar} = \frac{2\pi\hbar}{Fa}, \tag{2.243}$$


so that the Bloch oscillation frequency 




$$\omega_B \equiv \frac{2\pi}{\Delta t_B} = \frac{Fa}{\hbar}. \tag{2.244}$$


The direct-space motion of the wave packet's center $x_0(t)$ during the Bloch oscillation process
may be analyzed by integrating Eq. (235) over some time interval $\Delta t$:


72 This phenomenon may be also discussed from the point of view of the reduced zone picture, but then it 
requires the introduction of instant jumps between the Brillouin zone boundary points (see the dashed red line in 
Fig. 32) that correspond to physically equivalent states of the particle. Evidently, this language is more artificial. 
$$\Delta x_0 \equiv \int_0^{\Delta t} v_{gr}\,dt = \int_0^{\Delta t}\frac{d\omega}{dq_0}\,dt = \int\frac{d\omega}{dq_0}\frac{dt}{dq_0}\,dq_0 = \frac{\hbar}{F}\int d\omega = \frac{\hbar}{F}\Delta\omega. \tag{2.245}$$


If interval $\Delta t$ is equal to the Bloch oscillation period $\Delta t_B$ (243), the initial and final values of $E(q_0) =
\hbar\omega(q_0)$ are equal, giving $\Delta x_0 = 0$: at the end of the period, the wave packet returns to its initial position.
However, if we carry this integration only from the smallest to the largest values of $\omega(q_0)$, i.e. the points
where the group velocity vanishes, we get the oscillation swing

$$\Delta x_{max} = \frac{\hbar}{F}(\omega_{max} - \omega_{min}) \equiv \frac{\Delta E_1}{F}. \tag{2.246}$$

Bloch oscillations: spatial swing

This simple result may be interpreted using an alternative energy diagram (Fig. 32b) that results
from the following arguments. The additional force $F$ may be described not only via the 2nd Newton law
version (237), but, alternatively, by its contribution $U_F = -Fx$ to the total ("Gibbs"73) potential energy

$$U_\Sigma(x) = U(x) - Fx \tag{2.247}$$

of the system. The direct solution of the Schrödinger equation (61) with such potential may be hard to
find, but if the force is weak in the sense of Eq. (236), as we are assuming now, one can argue that our
quantum-mechanical treatment including the periodic potential $U(x)$ should be still correct, if the second
term in Eq. (247) is considered as a constant at the wave packet width scale $\delta x$, but dependent on
position $x_0$ of the packet's center. In this approximation, the total energy of the wave packet may be
found as

$$E_\Sigma = E(q_0) - Fx_0. \tag{2.248}$$


In a plot of such energy as a function of $x_0$ (Fig. 32b), the information on energy dependence on
$q_0$ is lost, but we already know it is rather uneventful, and well characterized by the position of band-gap
edges on the energy axis.74 In this representation, the Bloch oscillations of a relatively wide ($\delta x \gg a$)
wave packet should keep the full energy $E_\Sigma$ constant, i.e. follow a horizontal line in Fig. 32b, limited by
the classical turning points corresponding to the bottom and the top of the allowed energy band. The
distance $\Delta x_{max}$ between these points is evidently given by Eq. (246).
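The period (243) and the swing (246) may be reproduced by a direct integration of the semiclassical equations (235) and (237). The sketch below assumes a constant force, the cosine band $E(q) = -2\delta\cos qa$ (bandwidth $\Delta E_1 = 4\delta$) as a stand-in for the lowest band, and units with $\hbar = a = 1$; all names and parameter values are illustrative.

```python
import math

def bloch_half_period(delta=0.1, a=1.0, force=0.01, hbar=1.0, steps=2000):
    """Integrate dq0/dt = F/hbar and dx0/dt = v_gr(q0) from the band bottom
    (q0 = 0) to the band top (q0 = pi/a) with the midpoint rule.

    Returns the full Bloch period and the spatial swing of the packet center.
    """
    half_period = (math.pi / a) * hbar / force
    dt = half_period / steps
    x = 0.0
    for i in range(steps):
        q_mid = force * (i + 0.5) * dt / hbar                 # Eq. (237), constant F
        v_gr = 2.0 * delta * a * math.sin(q_mid * a) / hbar   # Eq. (235) for the cosine band
        x += v_gr * dt
    return 2.0 * half_period, x

period, swing = bloch_half_period()
expected_period = 2.0 * math.pi * 1.0 / (0.01 * 1.0)   # Eq. (243)
expected_swing = 4.0 * 0.1 / 0.01                       # Eq. (246): bandwidth / F
print(period, swing)
```

The integrated swing agrees with the bandwidth divided by the force, which is exactly the horizontal extent of the constant-energy line between the tilted band edges in Fig. 32b.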

Besides this second look at the oscillation swing result, the total energy diagram shown in Fig.
32b enables one more remarkable result. Let a wave packet be so narrow in the momentum space ($\delta q \to 0$)
that $1/\delta q \gg \Delta x_{max}$; then the horizontal line segment in Fig. 32b presents the spatial extension of the
eigenfunction of the Schrödinger equation with potential (247). But this equation is evidently invariant
with respect to the following simultaneous translation in coordinate and energy:


$$x \to x + a, \qquad E \to E - Fa. \tag{2.249}$$

Wannier-Stark ladder


This means that it is satisfied with an infinite set of similar solutions, each corresponding to one of the 
horizontal red lines shown in Fig. 32b. This is the famous Wannier-Stark ladder, with the step height 


$$\Delta E_S = Fa. \tag{2.250}$$
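For a reader who wants the invariance (249) spelled out, here is a short worked check, written under the assumption that the Hamiltonian is just the kinetic energy plus potential (247). Let $\psi(x)$ be an eigenfunction with energy $E$, and consider the shifted function $\psi(x - a)$:

```latex
\begin{aligned}
\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + U(x) - Fx\right]\psi(x-a)
 &= \left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + U(x-a) - F(x-a)\right]\psi(x-a) - Fa\,\psi(x-a) \\
 &= (E - Fa)\,\psi(x-a),
\end{aligned}
```

where the first step uses the periodicity $U(x) = U(x - a)$, and the second one uses the fact that the operator in the square brackets is the original Hamiltonian written in the shifted argument. Repeating the shift gives the eigenvalues $E - jFa$ with any integer $j$, i.e. a ladder of equidistant levels.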


73 See, e.g., CM Sec. 1.5. 

74 In semiconductor device physics and engineering, such plots are called the band edge diagrams, and are the 
virtually unavoidable components of any discussion or publication. 
The importance of this alternative representation of the Bloch oscillations is due to the following
fact. In most experimental realizations, the power of radiation at frequency (244), that may be extracted
from the oscillations by their electromagnetic coupling to an external detector, is very low, so that their
direct detection presents a hard problem.75 However, let us apply to a Bloch oscillator an additional rf
field at frequency $\omega \sim \omega_B$. As these frequencies are brought close together, the external signal should
synchronize ("phase lock") Bloch oscillations,76 resulting in certain observable changes - for example, a
resonant absorption of the external radiation. Now let us notice that Eqs. (244) and (250) yield the
following remarkable relation:

$$\Delta E_S = \hbar\omega_B. \tag{2.251}$$


This means that the resonant phenomena at $\omega \sim \omega_B$ allow for an alternative (but equivalent)
interpretation - as the result of rf-induced transitions77 between the steps of the Wannier-Stark ladder!
(Such occasions, when two very different languages may be used for the interpretation of the same
phenomenon, are among the most beautiful features of physics.)

This effect has been used for the first experimental confirmation of the Bloch oscillation theory.
For this purpose, the natural periodic structures, solid state crystals, are inconvenient due to their very
small period $a \sim 10^{-10}$ m. Indeed, according to Eq. (244), such structures require very high forces $F$ (and
hence high electric fields $\mathcal{E} = F/e$) to bring $\omega_B$ to an experimentally convenient range. This problem has
been overcome by fabricating artificial periodic structures (superlattices) of certain semiconductor
compounds, such as Ga$_{1-x}$Al$_x$As with various degrees $x$ of gallium to aluminum atom replacement,
whose layers may be grown over each other epitaxially, i.e., with very few crystal structure
violations. These superlattices, with periods $a \sim 10$ nm, have allowed a clear observation of resonant
effects at $\omega \sim \omega_B$, and hence the measurement of the Bloch oscillation frequency, in particular its
proportionality to the applied dc electric field, predicted by Eq. (244).78
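To get a feeling for the numbers behind this argument, the sketch below evaluates Eq. (244) with $F = e\mathcal{E}$ for the two periods quoted above. The field value of $10^6$ V/m is an arbitrary illustrative assumption, not a figure from the cited experiment, and the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34       # J*s
E_CHARGE = 1.602176634e-19   # C

def bloch_frequency_hz(field_v_per_m, period_m):
    """Cyclic Bloch frequency f_B = omega_B / (2*pi), with omega_B = e*field*a/hbar (Eq. 244)."""
    force = E_CHARGE * field_v_per_m
    return force * period_m / HBAR / (2.0 * math.pi)

FIELD = 1e6  # V/m, assumed for illustration only
f_crystal = bloch_frequency_hz(FIELD, 1e-10)       # natural crystal, a ~ 0.1 nm
f_superlattice = bloch_frequency_hz(FIELD, 1e-8)   # superlattice, a ~ 10 nm
print(f_crystal, f_superlattice)
```

At the same field, the hundredfold larger period raises the Bloch frequency by the same factor, which is the practical advantage of the superlattices described in the text.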

Very soon after this observation, the Bloch oscillations have been observed in small Josephson
junctions.79 Since this experiment involved two important conceptual issues, let me discuss it in a little
bit more detail. As was discussed in Sec. 2.3, the Josephson junction dynamics may be reasonably well
described by two simple equations (54) and (55). They may be combined to calculate the work of an
external voltage source at Josephson phase change between arbitrary initial ($\varphi_{ini}$) and final ($\varphi_{fin}$) values,
as the integral of its power $IV$ over the time interval $\Delta t$ of the change:


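$$\text{work} \equiv \int_{\Delta t} IV\,dt = \int_{\Delta t}\left(I_c\sin\varphi\right)\left(\frac{\hbar}{2e}\frac{d\varphi}{dt}\right)dt = \frac{\hbar I_c}{2e}\int_{\varphi_{ini}}^{\varphi_{fin}}\sin\varphi\,d\varphi = -\frac{\hbar I_c}{2e}\left(\cos\varphi_{fin} - \cos\varphi_{ini}\right). \tag{2.252}$$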


We see that the work depends only on the initial and final values of $\varphi$ (but not on the law of phase
evolution in time), i.e. may be presented as the difference $U(\varphi_{fin}) - U(\varphi_{ini})$, where function


75 In systems with many independent particles (such as semiconductors), the detection problem is exacerbated by 
phase incoherence of the Bloch oscillations performed by each particle. This drawback is absent in atomic Bose- 
Einstein condensates whose Bloch oscillations (in a periodic potential created by standing optical waves) were 
eventually observed by M. Ben Dahan et al., Phys. Rev. Lett. 76, 4508 (1996). 

76 A simple analysis of phase locking of a classical oscillator may be found, e.g., in CM Sec. 4.4. 

77 A qualitative theory of such transitions will be discussed in Sec. 6.6 and then in Chapter 7. 

78 E. Mendez et al., Phys. Rev. Lett. 60, 2426 (1988).

79 L. Kuzmin and D. Haviland, Phys. Rev. Lett. 67, 2890 (1991). 
$$U(\varphi) = -E_J\cos\varphi + \text{const}, \quad \text{with } E_J \equiv \frac{\hbar I_c}{2e}, \tag{2.250}$$


may be interpreted as the potential energy of the junction (if we consider the Josephson phase as a
generalized coordinate). This energy apart, the Josephson junction, as a system of two close, nearly
isolated superconductors, has a certain capacitance $C$ and the associated electrostatic energy $E_C =
CV^2/2$. Using Eq. (54) again, we may present it as


$$E_C = \frac{C}{2}V^2 = \frac{C}{2}\left(\frac{\hbar}{2e}\right)^2\left(\frac{d\varphi}{dt}\right)^2. \tag{2.251}$$


This means that from the point of view of phase $\varphi$ as a generalized coordinate, $E_C$ should be considered
the kinetic energy of the system, whose dependence on the generalized velocity $d\varphi/dt$ is similar to that
of a 1D mechanical particle, with an effective mass80


$$m_J = C\left(\frac{\hbar}{2e}\right)^2. \tag{2.252}$$


Hence the total energy of the junction, $E_C + U(\varphi)$, is formally similar to that of a 1D nonrelativistic
particle in the sinusoidal potential with the $\varphi$-axis period $a_J = 2\pi$.

However, before applying the results of the 1D band theory to this system, we have to resolve one
paradox (that was the subject of a lively discussion just about 30 years ago). When we develop the band
theory, we imply that the particle's translation by period $a$ is (in principle) measurable, i.e. particle positions $x$ and
($x + a$) are distinguishable - otherwise Eq. (193) with $q \ne 0$ would not make much sense. For a
mechanical particle this assumption is very plausible, but less so for a Josephson junction. Indeed, for
example, if we change $\varphi$ by $a_J = 2\pi$ via changing the phase of one of the superconductors, say $\varphi_1$ (Fig. 3),
by $2\pi$, then its wavefunction becomes $|\psi|\exp\{i(\varphi_1 + 2\pi)\} = |\psi|\exp\{i\varphi_1\}$, and it is not immediately
clear whether these two states may be distinguished. In order to resolve this contradiction, it is sufficient
to have a look at Eq. (54). It shows that if $\varphi$ changes in time by $2\pi$ (say, by a fast ramp-up), voltage $V$
across the junction exhibits a pulse with "area"


. 

\ v mt=E\ 

d -Ei, = E 

dt 2e ^ 

' dtp = — 2n = — *2xl0~ 15 V-s. 
2e 2e 


Such single-flux-quantum (SFQ) pulses 81 not only may be measured experimentally, but even have been 
used for signaling and ultrafast (sub-THz) computation, to the best of my knowledge still keeping the 
absolute records for the highest speed and smallest energy consumption at computation. 82 
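As a quick numerical check of the pulse "area" quoted above, here is a minimal sketch that evaluates $(\hbar/2e)\,2\pi$ directly. The constant values are the standard SI ones; the function name is just an illustrative choice.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def sfq_pulse_area():
    """Time integral of V(t) for a 2*pi change of the Josephson phase, in V*s."""
    return (HBAR / (2 * E_CHARGE)) * 2 * math.pi


print(f"SFQ pulse area: {sfq_pulse_area():.3e} V*s")
```

The result is about $2.07\times10^{-15}$ V·s, consistent with the rounded estimate in the text.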

Hence, the $2\pi$-shifts of phase $\varphi$ are measurable, and in the absence of dissipation the Josephson
junction dynamics is indeed similar to that of a 1D particle in a periodic (sinusoidal) potential, and its
energy spectrum forms energy bands and gaps described by the Mathieu equation - see Fig. 31.
Experimentally, the easiest way to verify this picture is to measure the corresponding Bloch oscillations 


80 Of course, the dimensionality of $m_J$ so defined is different from kg.

81 This term has originated from the fact that the right-hand part of Eq. (253) equals the single quantum unit
($\Phi_0$) of the magnetic flux in superconductors - see Sec. 3.1 below.

82 See, e.g., P. Bunyk et al., Int. J. on High Speed Electronics and Systems 11 , 257 (2001). 
induced by an external current $I_{ex}(t)$. In order to find the frequency of these oscillations, it is sufficient to
replace Eq. (237), which expresses the 2nd Newton law averaged over period a of potential U(x), with
the charge balance equation

$$\frac{dQ}{dt} = I_{ex}(t), \tag{2.254}$$


where Q is the “quasi-charge” 83 , i.e. the electric charge of the capacitor averaged over the period $2\pi$ of
the periodic potential $U(\varphi)$. (Notice that at such averaging, current (55) is averaged out from the
equation, so that it affects the phenomena “only” via its contribution to the energy band structure.)

Since the Josephson-junction analog of the genuine wave number $k = m(dx/dt)/\hbar$ of a particle is

$$k_J = \frac{m_J}{\hbar}\frac{d\varphi}{dt} = \frac{m_J}{\hbar}\frac{2e}{\hbar}V = \frac{CV}{2e}, \tag{2.255}$$

and CV is the genuine charge on the capacitor, the analog of q (the quasi-momentum divided by $\hbar$) may
be obtained just by the replacement of that product with quasi-charge Q:

$$q_J = \frac{Q}{2e}. \tag{2.256}$$

Comparing this expression with Eq. (254), we see that $q_J$ obeys the following equation of motion:

$$\frac{dq_J}{dt} = \frac{I_{ex}(t)}{2e}, \tag{2.257}$$


so that the role of force F is now played by $F_J = \hbar I_{ex}/2e$. Hence if $I_{ex}(t) = \text{const} = \bar I$, we can use Eq. (244)
with that replacement, and also $a \to a_J = 2\pi$, to get

$$f_B = \frac{\omega_B}{2\pi} = \frac{1}{2\pi}\frac{F_J a_J}{\hbar} = \frac{\bar I}{2e}. \tag{2.258}$$

This very simple result has the following physical sense. 84 In the quantum operation mode, the
junction is recharged by the external current, following Eq. (256), until its electric charge reaches e (i.e.
$q_J a_J = (Q/2e)2\pi$ reaches $\pi$ - see Fig. 32a); then one Cooper pair passes through the junction, changing its
charge to e - (2e) = -e, with the same charging energy (251) - the process analogous to crossing the
border of the 1st Brillouin zone; then the process repeats again and again. It is remarkable that Eq. (258),
describing the frequency of such a quantum property of the Josephson phase $\varphi$ as its Bloch oscillations,
does not include the Planck constant, while Eq. (56), describing the classical motion of $\varphi$, does.
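To get a feeling for the numbers, the following sketch compares the Bloch oscillation frequency (258) with the Josephson oscillation frequency. It assumes that Eq. (56) has the standard form $\omega_J = 2eV/\hbar$ (that equation is not reproduced in this section), and the bias values of 1 nA and 1 µV are arbitrary illustrative choices.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def bloch_frequency(current_amps):
    """Bloch oscillation frequency f_B = I / 2e of Eq. (258), in Hz."""
    return current_amps / (2 * E_CHARGE)


def josephson_frequency(voltage_volts):
    """Josephson frequency f_J = (2e/hbar) V / (2*pi), assuming the standard form of Eq. (56)."""
    return 2 * E_CHARGE * voltage_volts / (2 * math.pi * HBAR)


print(f"f_B at 1 nA: {bloch_frequency(1e-9):.3e} Hz")
print(f"f_J at 1 uV: {josephson_frequency(1e-6):.3e} Hz")
```

Note that only the second function contains $\hbar$, in line with the remark above.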


In this context, one may wonder which of these two types of oscillations would a dc-biased 
Josephson junction generate. For the dissipation-free junction, the answer is: the Bloch oscillations 
(258) with frequency proportional to dc current. However, any practical junction has some energy losses 
that may be (approximately) described by a certain Ohmic conductance G connected in parallel to the 


83 Eq. (254) tells us that quasi-charge Q has the simple physical sense of the external electric charge being
inserted into the junction by the external current $I_{ex}$ - just like the physical sense of quasi-momentum $\hbar q$ of a
mechanical particle, according to Eq. (237), is the contribution to particle’s momentum by the external force F.

84 D. Averin et al., Sov. Phys. - JETP 61, 407 (1985). 
junction. Very luckily for Dr. Josephson and his Nobel Prize, it is much easier to fabricate and test
junctions with $G \gg 1/R_Q$, where $R_Q$ is the so-called quantum unit of resistance

$$R_Q \equiv \frac{\pi\hbar}{2e^2} \approx 6.45\ \mathrm{k\Omega}, \tag{2.259}$$

the fundamental constant that jumps out at analysis of several other effects as well - see, e.g., Sec. 3.2.
As will be discussed in Chapter 7, such high energy losses provide what is called dephasing - the
suppression of the quantum coherence between different quantum states of the system - in our current
case, between the wavefunctions $u(\varphi - 2\pi j)$ localized at different minima of the periodic potential $U(\varphi)$,
and thus make the dynamics of the Josephson phase $\varphi$ virtually classical, obeying equations (54) and
(55). As we have seen in Sec. 2, dc biasing of such a junction leads to Josephson oscillations with
frequency (56) proportional to the applied dc voltage.
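The value in Eq. (259) is easy to reproduce. The sketch below also evaluates the dimensionless product $G R_Q$ for a hypothetical 100-ohm shunt (an illustrative number, not taken from the text), to show how strongly the condition $G \gg 1/R_Q$ may be satisfied in practice.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C

# Quantum unit of resistance, Eq. (259), in ohms
R_Q = math.pi * HBAR / (2 * E_CHARGE**2)


def damping_ratio(conductance_siemens):
    """Return G * R_Q; values much larger than 1 correspond to G >> 1/R_Q."""
    return conductance_siemens * R_Q


print(f"R_Q = {R_Q / 1e3:.2f} kOhm")
print(f"G*R_Q for a 100-ohm shunt: {damping_ratio(1 / 100.0):.1f}")
```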


2.9. Landau-Zener tunneling 

All the Bloch oscillation discussion in the last section was based on the premise that the particle
stays within one (say, the lowest) energy band. However, just a single look at Fig. 32 shows that this
assumption becomes unrealistic if the energy gap separating this band from the next one becomes very
small, $\Delta_1 \to 0$. Indeed, in the weak potential approximation, which is adequate in this limit, at $|U_1| \to 0$,
the two dispersion curve branches (216) cross without any interaction, so that if our particle (the wave
packet) is driven to approach that point, it should continue to move up in energy - see the dashed blue
arrow in Fig. 32a. Similarly, in the “energy-domain” presentation shown in Fig. 32b, it is intuitively
clear that at $\Delta_1 \to 0$, the particle residing at one of the steps of the Wannier-Stark ladder should be able to
somehow overcome the vanishing spatial gap $\Delta x_0 = \Delta_1/F$ and to leak into the next band - see the
horizontal dashed blue arrow on that panel.


This process, called the Landau-Zener (or “interband”, or “band-to-band”) tunneling 85 is indeed
possible. In order to analyze it, let us first take F = 0, and consider what happens if a quantum particle
described by an x-long (i.e. k-narrow) wave packet is incident from the free space upon a periodic
structure of a large but finite length $l \gg a$. If packet’s energy E is within one of the energy bands, it
may evidently propagate through the structure (though may be partly reflected from its front end). The
corresponding quasi-momentum may be found by solving the dispersion relation for q; for example, in
the weak-potential limit, Eq. (224), which is valid near the gap, yields

$$q = q_m + \tilde q, \quad \tilde q = \pm\frac{1}{\gamma}\left[\tilde E^2 - |U_n|^2\right]^{1/2}, \quad \text{where } \tilde E \equiv E - E^{(n)}, \tag{2.260}$$

and $\gamma$ is given by the second of Eqs. (225).

Now, if energy E corresponds to one of the energy gaps $\Delta_n$, the propagation is impossible, so that
the packet is completely reflected back. However, our analysis of the potential step problem in Sec. 3
implies that the wavefunction would still have an exponential tail protruding into the periodic structure
and decaying on some length $\delta$ - see Eq. (67). Indeed, a review of the calculation leading to Eq. (260)


85 It was predicted independently by L. D. Landau, Phys. Z. Sowjetunion 2 , 46 (1932) and C. Zener, Proc. R. Soc. 
London A 137 , 696 (1932). 
shows that it remains valid within the gap as well, if the quasi-momentum is understood as a purely
imaginary number:

$$\tilde q \to \pm i\kappa, \quad \text{where } \kappa \equiv \frac{1}{\gamma}\left[|U_n|^2 - \tilde E^2\right]^{1/2}, \quad \text{for } \tilde E^2 < |U_n|^2. \tag{2.261}$$

With such a contribution, the Bloch solution (193b) indeed describes an exponential decay of the
wavefunction at length $\delta = 1/\kappa$.


Now returning to the effects of weak force F in the energy-domain approach, presented by Eq.
(248) and illustrated in Fig. 32b, we may recast Eq. (261) as

$$\kappa(x) = \frac{1}{\gamma}\left[|U_n|^2 - (Fx)^2\right]^{1/2}, \tag{2.262}$$

where x is particle’s (i.e. wave packet center’s) deviation from the mid-gap point. Thus the gap has
created a potential barrier of a finite width $\Delta x_0 = 2|U_n|/F$, through which the wave packet may tunnel
with a finite probability. As we already know, in the WKB approximation (in our case requiring $\kappa\Delta x_0 \gg 1$)
this probability is just the tunnel barrier’s transparency T, which may be calculated from Eq.
(117):

$$-\ln T = 2\int_{\kappa(x)^2>0}\kappa(x)\,dx = \frac{2}{\gamma}\int_{-x_c}^{+x_c}\left[|U_n|^2 - (Fx)^2\right]^{1/2}dx = \frac{2|U_n|}{\gamma}\,2x_c\int_0^1\left(1-\xi^2\right)^{1/2}d\xi, \tag{2.263}$$

where $\pm x_c = \pm\Delta x_0/2 = \pm|U_n|/F$ are the classical turning points. Working out this simple integral (which
may be viewed as a quarter of the unit circle’s area, and hence is equal to $\pi/4$), we get

$$T = \exp\left\{-\frac{\pi|U_n|^2}{\gamma F}\right\}. \tag{2.264}$$


This famous result was obtained by Landau and Zener in a more complex way, whose advantage
is a constructive proof that Eq. (264) is valid for an arbitrary relation between $\gamma F$ and $|U_n|^2$, i.e. arbitrary T,
while our simple derivation was limited to the WKB approximation, i.e. to $T \ll 1$. 86
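The WKB integral behind Eq. (264) can be checked numerically. The sketch below is only an illustration: it takes $\kappa(x) = (1/\gamma)[|U_n|^2 - (Fx)^2]^{1/2}$ as in Eq. (262), integrates it between the turning points $\pm|U_n|/F$ with a simple midpoint rule, and compares $-\ln T$ with $\pi|U_n|^2/\gamma F$. The parameter values are arbitrary (dimensionless) assumptions.

```python
import math


def wkb_exponent(u_n, gamma, force, steps=200_000):
    """Midpoint-rule estimate of -ln T = 2 * integral of kappa(x) dx over [-x_c, +x_c]."""
    x_c = abs(u_n) / force              # classical turning point
    dx = 2 * x_c / steps
    total = 0.0
    for i in range(steps):
        x = -x_c + (i + 0.5) * dx
        # kappa(x) of Eq. (262); max() guards against tiny negative round-off
        total += math.sqrt(max(u_n**2 - (force * x) ** 2, 0.0)) / gamma
    return 2 * total * dx


def landau_zener_exponent(u_n, gamma, force):
    """Closed-form exponent of Eq. (264): pi * |U_n|^2 / (gamma * F)."""
    return math.pi * u_n**2 / (gamma * force)


# Arbitrary illustrative parameters (dimensionless units)
U_N, GAMMA, FORCE = 0.7, 1.3, 0.05
numeric = wkb_exponent(U_N, GAMMA, FORCE)
exact = landau_zener_exponent(U_N, GAMMA, FORCE)
print(f"numeric: {numeric:.6f}, closed form: {exact:.6f}")
```

The two numbers should agree to within the accuracy of the quadrature, confirming the "quarter of the unit circle" shortcut.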


Returning to Eqs. (225) and (237), we can rewrite the product $\gamma F$ participating in Eq. (264) as

$$\gamma F = \frac{1}{2}\frac{d(E_l - E_r)}{dq_0}\bigg|_{E_l=E_r=E^{(n)}}\hbar\frac{dq_0}{dt} = \frac{\hbar}{2}\frac{d(E_l - E_r)}{dt}\bigg|_{E_l=E_r=E^{(n)}} \equiv \frac{\hbar u}{2}, \tag{2.265}$$

where u has the meaning of the “speed” of the energy level crossing in the absence of the gap. Hence,
Eq. (264) may be presented in a form

$$T = \exp\left\{-\frac{2\pi|U_n|^2}{\hbar u}\right\}, \tag{2.266}$$


86 Note that Eq. (264) is still limited to the hyperbolic dispersion relation, i.e. (in the band theory) to the weak
potential limit. In the opposite, tight-binding limit, the interband tunneling may be treated as an excitation of the
upper band states by sinusoidal Bloch oscillations, and is completely suppressed at $\hbar\omega_B < \Delta_1$.
that is more physically transparent. 87 Indeed, the fraction $2|U_n|/u = \Delta_n/u$ gives the time scale $\Delta t$ of
energy’s crossing the gap region, and according to the Fourier transform, its reciprocal, $\omega_{max} \sim 1/\Delta t$,
gives the upper cutoff of frequencies involved in the Bloch oscillation process. Hence Eq. (266) means
that

$$-\ln T \approx \frac{\Delta_n}{\hbar\omega_{max}}. \tag{2.267}$$

This formula allows us to interpret the Landau-Zener tunneling as the system’s excitation across the
energy gap $\Delta_n$ by the maximum energy quantum $\hbar\omega_{max}$ available from the Bloch oscillation process.

The interband tunneling is an important ingredient of several physical phenomena and even some
practical devices, for example the tunneling (or “Esaki”) diodes. This simple device is just a junction of
two semiconductor electrodes, one of which is so strongly n-doped by electron donors that the additional
electrons form a degenerate Fermi gas at the bottom of the conduction band. Similarly, the opposite
electrode is p-doped so strongly that the Fermi level of electrons in the valence band is lowered below
the band edge (Fig. 33).



Fig. 2.33. Tunneling diode: (a) the band edge diagram of the device at zero bias; (b) the same diagram at
modest positive bias $eV \sim \Delta/2$, and (c) the I-V curve (schematically). Dashed lines show the Fermi level
positions.


At thermal equilibrium, and in the absence of external voltage bias, the Fermi levels self-align, 88
leading to the build-up of the contact potential difference $\phi/e$, with $\phi$ somewhat larger than the energy
bandgap $\Delta$ - see Fig. 33a. This potential difference creates an internal electric field that tilts the energy
bands (just as the external field did in Fig. 32b), and leads to the formation of the so-called depletion
layer in which the Fermi level is located within the energy gap and hence there are no charge carriers
ready to move. In usual p-n junctions, this layer is broad and prevents any current at applied voltages V
lower than $\Delta/e$. In contrast, in a tunneling diode the depletion layer is so thin (below ~10 nm) that the
interband tunneling is possible and provides a substantial Ohmic current at small applied voltages - see
Fig. 33c.

However, at substantial positive bias, $eV \sim \Delta/2$, the conduction band becomes aligned with the
middle of the gap in the p-doped electrode, and electrons cannot tunnel there. Similarly, there are no


87 In Chapter 6, Eq. (266) will be derived using a different method based on the Golden Rule of quantum 
mechanics. 

88 See, e.g., SM Secs. 1.5 and 6.4. 
electrons in the n-doped semiconductor to tunnel into the available states just above the Fermi level in
the p-doped electrode - see Fig. 33b. As a result, current drops significantly, to grow again only when
eV exceeds ~$\Delta$ and allows the electron motion within each energy band. Thus the tunnel
junction’s I-V curve has a part with negative differential resistance (dV/dI < 0). This effect may be used
for the amplification of analog signals, including self-excitation of electrical oscillators (i.e. rf signal
generation), 89 and signal swing restoration in digital electronics.


2.10. Harmonic oscillator: A brute force approach


To complete our review of 1D systems, we have to consider the famous harmonic oscillator, i.e.
a 1D particle moving in the quadratic-parabolic potential (111). This is just a smooth quantum well
providing “soft” confinement, whose discrete spectrum we have already found in the WKB
approximation - see Eq. (114). Let us try to solve the same problem exactly - not because there is
anything conceptually interesting in it (there is not :-), but because of its enormous importance for
applications. For that, let us write the stationary Schrodinger equation for potential (111):

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{m\omega_0^2x^2}{2}\psi = E\psi. \tag{2.268}$$


From the solution of Exercise Problem 1.5, the reader already knows 90 one of the eigenfunctions of this
equation,

$$\psi_0(x) = C\exp\left\{-\frac{m\omega_0x^2}{2\hbar}\right\}, \tag{2.269}$$

and the corresponding eigenenergy

$$E_0 = \frac{\hbar\omega_0}{2}. \tag{2.270}$$

Expression (269) shows that the characteristic scale of wavefunction’s spatial spread 91 is equal to

$$x_0 \equiv \left(\frac{\hbar}{m\omega_0}\right)^{1/2}. \tag{2.271}$$

Due to the importance of this scale, let us give its crude estimates for several typical systems: 

(i) Electrons in solids and fluids: $m \approx 10^{-30}$ kg, $\omega_0 \sim 10^{15}$ s$^{-1}$, giving $x_0 \sim 0.3$ nm, comparable
with inter-atomic distances a. As a result, classical mechanics is not valid at all for the analysis of their
motion.

(ii) Atoms in solids: $m \approx 10^{-24}$-$10^{-26}$ kg, $\omega_0 \sim 10^{13}$ s$^{-1}$, giving $x_0 \sim 0.01$-$0.1$ nm, i.e. from ~a few
percent to a few tens percent of a. Because of that, methods based on classical mechanics (e.g., molecular
dynamics) are approximately valid for the analysis of atomic motion, though may miss some fine effects


89 See, e.g., CM Sec. 4.4. 

90 If not yet, I am inviting him or her to check this fact now by the direct substitution of solution (269) into the 
differential equation (268), simultaneously proving Eq. (270). 

91 Quantitatively, as was already mentioned in Sec. 2.1, $x_0 = \sqrt{2}\,\delta x = (2\langle x^2\rangle)^{1/2}$.
of motion of lighter atoms - e.g., quantum tunneling of hydrogen atoms through energy barriers of the
potential profile created by their neighbors.

(iii) Probe masses in modern gravity-wave detectors (Advanced LIGO, VIRGO, KAGRA,
etc.): 92 $m \sim 10^2$ kg, $\omega_0 \sim 10^2$ s$^{-1}$, giving $x_0 \sim 10^{-19}$ m. After several decades of development, the sensitivity
of these instruments is still limited by various noise sources at the level of the order of $10^{-18}$ m. 93 Thus
the prospects of observing quantum-mechanical effects in such installations do not look very realistic.
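The crude estimates above follow directly from $x_0 = (\hbar/m\omega_0)^{1/2}$. A minimal sketch reproducing them is given below; the masses and frequencies are the order-of-magnitude values quoted in the list, and for atoms a single representative mass of $10^{-26}$ kg is an assumption made here for illustration.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s


def spread_scale(mass_kg, omega_rad_s):
    """Characteristic spatial spread x0 = sqrt(hbar / (m * omega0)), in meters."""
    return math.sqrt(HBAR / (mass_kg * omega_rad_s))


# (label, mass in kg, frequency in 1/s): order-of-magnitude inputs only
systems = [
    ("electron in a solid", 1e-30, 1e15),
    ("atom in a solid (light)", 1e-26, 1e13),
    ("gravity-wave detector probe mass", 1e2, 1e2),
]

for label, mass, omega in systems:
    print(f"{label}: x0 ~ {spread_scale(mass, omega):.1e} m")
```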

Returning to the Schrodinger equation (268), let us recast it into a dimensionless form by
introducing the dimensionless variable $\xi \equiv x/x_0$. This gives

$$-\frac{d^2\psi}{d\xi^2} + \xi^2\psi = \varepsilon\psi, \tag{2.272}$$

where $\varepsilon \equiv 2E/\hbar\omega_0 = E/E_0$. In this notation, the ground state wavefunction is proportional to $\exp\{-\xi^2/2\}$,
so let us look for the solutions to Eq. (272) in the form

$$\psi = C\exp\left\{-\frac{\xi^2}{2}\right\}H(\xi), \tag{2.273}$$

where $H(\xi)$ is a new function. With this substitution, Eq. (272) yields

$$\frac{d^2H}{d\xi^2} - 2\xi\frac{dH}{d\xi} + (\varepsilon - 1)H = 0. \tag{2.274}$$
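The algebra behind this substitution is short enough to spell out. The following is a sketch of the intermediate steps, with primes denoting $d/d\xi$ (a shorthand introduced here only for brevity):

```latex
\begin{align*}
\psi   &= C e^{-\xi^2/2} H, \\
\psi'  &= C e^{-\xi^2/2}\left(H' - \xi H\right), \\
\psi'' &= C e^{-\xi^2/2}\left[H'' - 2\xi H' + (\xi^2 - 1)H\right], \\
-\psi'' + \xi^2\psi - \varepsilon\psi
       &= -C e^{-\xi^2/2}\left[H'' - 2\xi H' + (\varepsilon - 1)H\right] = 0 .
\end{align*}
```

Since the exponential factor never vanishes, the expression in the last square brackets has to equal zero, which is the equation for $H(\xi)$.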

It is evident that H = const and $\varepsilon = 1$ is one of its solutions, describing the eigenstate (269) with
energy (270), but what are the other eigenstates and eigenvalues? This equation was studied in
detail in the mid-1800s by C. Hermite, who showed that all its eigenvalues are given by the equation

$$\varepsilon_n - 1 = 2n, \quad \text{with } n = 0, 1, 2, \ldots, \tag{2.275}$$

so that our WKB result (114) is indeed exact for any n, and Eqs. (269) and (270) describe the ground
state of the oscillator. The eigenfunction corresponding to eigenvalue $\varepsilon_n$ is a polynomial (now called the
Hermite polynomial) of degree n, that may be most conveniently calculated using the following explicit
formula:

$$H_n = (-1)^n\exp\{\xi^2\}\frac{d^n}{d\xi^n}\exp\{-\xi^2\}. \tag{2.276}$$


It is easy to use this formula to calculate several lowest-degree polynomials - see Fig. 34a:

$$H_0 = 1, \quad H_1 = 2\xi, \quad H_2 = 4\xi^2 - 2, \quad H_3 = 8\xi^3 - 12\xi, \quad H_4 = 16\xi^4 - 48\xi^2 + 12, \ldots \tag{2.277}$$
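The same list can be generated programmatically. The sketch below does not use the explicit formula from the text; instead it relies on the standard three-term recurrence $H_{n+1} = 2\xi H_n - 2nH_{n-1}$, which is a well-known property of these polynomials but is not derived in this section. Polynomials are stored as coefficient lists, lowest power first.

```python
def hermite_coefficients(n):
    """Coefficients of the Hermite polynomial H_n(xi), lowest power first."""
    prev, curr = [1], [0, 2]          # H_0 = 1, H_1 = 2*xi
    if n == 0:
        return prev
    for k in range(1, n):
        # H_{k+1} = 2*xi*H_k - 2*k*H_{k-1}
        nxt = [0] + [2 * c for c in curr]   # multiplication by 2*xi shifts powers up
        for i, c in enumerate(prev):
            nxt[i] -= 2 * k * c
        prev, curr = curr, nxt
    return curr


for n in range(5):
    print(n, hermite_coefficients(n))
```

For n = 4 the list reads 12, 0, -48, 0, 16, i.e. $16\xi^4 - 48\xi^2 + 12$, in agreement with Eq. (277).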

The most important properties of the polynomials are as follows: 

(i) their “parity” (symmetry-antisymmetry) alternates with number n, 

(ii) $H_n(\xi)$ crosses the $\xi$-axis exactly n times (has n zeros), and


92 See, e.g., http://www.ligo.caltech.edu/, and a recent update by T. Feder, Phys. Today 68, No. 9, 20 (2015). 

93 According to the recent announcement by B. Abbott et al., Phys. Rev. Lett. 116, 061102 (2016), this sensitivity 
was sufficient for the first direct detection of gravitational waves emitted at a merger of two black holes. 
(iii) the polynomials are mutually orthogonal in the following sense:

$$\int_{-\infty}^{+\infty}H_n(\xi)H_{n'}(\xi)\exp\{-\xi^2\}\,d\xi = \pi^{1/2}2^nn!\,\delta_{n,n'}. \tag{2.278}$$


Using Eq. (273) to translate this result to functions $\psi_n(x)$, we get the following orthonormal
eigenfunctions of the harmonic oscillator (Fig. 34b): 94

$$\psi_n(x) = \frac{1}{(2^nn!)^{1/2}\pi^{1/4}x_0^{1/2}}\exp\left\{-\frac{x^2}{2x_0^2}\right\}H_n\!\left(\frac{x}{x_0}\right). \tag{2.279}$$



Fig. 2.34. (a) A few lowest Hermite
polynomials and (b) the corresponding 
eigenenergies (dashed lines) and 
eigenfunctions (solid lines) of the 
harmonic oscillator. The black dashed 
line shows the potential profile U(x), 
drawn on the same scale as energies E n , 
so that the line crossings with the energy 
levels correspond to the classical turning 
points. 
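The orthonormality of the oscillator's eigenfunctions can be verified numerically. The sketch below assumes the standard normalized form $\psi_n = (2^nn!\,\pi^{1/2}x_0)^{-1/2}\exp\{-x^2/2x_0^2\}H_n(x/x_0)$, evaluates $H_n$ by the standard recurrence $H_{n+1} = 2\xi H_n - 2nH_{n-1}$ (not derived in the text), and integrates products of two eigenfunctions by the trapezoid rule. The integration window and step count are arbitrary choices.

```python
import math


def hermite(n, xi):
    """Value of the Hermite polynomial H_n(xi), via the three-term recurrence."""
    h_prev, h = 1.0, 2.0 * xi
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * xi * h - 2.0 * k * h_prev
    return h


def psi(n, x, x0=1.0):
    """Normalized n-th eigenfunction of the harmonic oscillator with spread scale x0."""
    norm = 1.0 / math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi) * x0)
    xi = x / x0
    return norm * math.exp(-xi * xi / 2.0) * hermite(n, xi)


def overlap(n, m, x0=1.0, half_width=12.0, steps=24_000):
    """Trapezoid-rule estimate of the integral of psi_n * psi_m over [-12*x0, +12*x0]."""
    dx = 2.0 * half_width * x0 / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width * x0 + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * psi(n, x, x0) * psi(m, x, x0)
    return total * dx


print(f"<3|3> = {overlap(3, 3):.6f}")
print(f"<2|4> = {overlap(2, 4):.2e}")
```

The diagonal overlaps come out equal to 1 and the off-diagonal ones vanish, within the quadrature accuracy.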


Besides its own importance, this is a typical example of eigenstates of a particle confined in a soft-
wall quantum well. It is very instructive to compare them with eigenstates of the rectangular quantum
well, with its ultimately-hard walls - see Eq. (1.76) and Fig. 1.7. Let us list their similar features:


94 These stationary states of the harmonic oscillator are sometimes called its Fock states, to distinguish them from 
other fundamental solutions (such as Glauber states) which will be discussed in Sec. 5.5 and beyond.
(i) Wavefunctions oscillate in the classically-allowed regions with $E_n > U(x)$, while
dropping exponentially beyond the boundaries of that region.

(ii) Each step up the energy level ladder increases the number of the oscillation half- 
waves (and hence the number of its zeros), by one. 95 

Here are the major features specific for the soft confinement: 

(i) The spatial spread of the wavefunction grows with n, following the gradual increase of 
the classically allowed region. 

(ii) Correspondingly, $E_n$ exhibits a slower growth than the $E_n \propto n^2$ law given by Eq.
(1.77), because of the gradual reduction of confinement, which moderates the growth of kinetic energy.

Unfortunately, the brute-force approach to the harmonic oscillator problem, discussed above, is 
not too appealing intellectually. First, the proof of Eq. (276) is rather longish. More importantly, it is 
hard to use Eq. (279) for calculation of the so-called matrix elements of the system - as we will see in 
Chapter 4, virtually the only numbers important for applications. Finally, it is also almost evident that 
there should be some straightforward math leading to any formula as simple as Eq. (114) for E n . Indeed, 
there is a much more efficient, operator-based approach to this problem; it will be described in Sec. 5.4. 


2.11. Exercise problems 

2.1. The initial wave packet of a free 1D particle is described by Eq. (2.20) of the lecture notes:

$$\Psi(x,0) = \int a_k e^{ikx}\,dk.$$

(i) Obtain a compact expression for the expectation value $\langle p\rangle$ of particle's momentum. Does $\langle p\rangle$
depend on time?

(ii) Calculate $\langle p\rangle$ for the case when function $|a_k|$ is symmetric with respect to some value $k_0$.

2.2. Calculate the function $a_k$, defined by Eq. (2.20), for the wave packet with a rectangular
envelope:

$$\Psi(x,0) = \begin{cases} C\exp\{ik_0x\}, & \text{for } -a/2 < x < +a/2, \\ 0, & \text{otherwise.} \end{cases}$$

Analyze the result in the limit $k_0a \to \infty$.


2.3. Prove Eq. (49) for the 1D propagator of a free quantum particle, starting from Eq. (48).

2.4. Express the 1D propagator, defined by Eq. (44), via eigenfunctions and eigenenergies of a
particle moving in an arbitrary stationary potential U(x). (For the notation simplicity, assume that the 
energy spectrum of the system is discrete.) 


95 In mathematics, a slightly more general statement, valid for a broader class of ordinary linear differential 
equations, is frequently called the Sturm oscillation theorem, and is a part of the Sturm-Liouville theory of such
equations - see, e.g., Chapter 10 in the handbook by G. Arfken et al. recommended in MA Sec. 16. 
2.5. Calculate the change of the wavefunction of a 1D particle, resulting from a short pulse of an
external force, which may be approximated by the delta-function: 96

$$F(t) = P\delta(t).$$

2.6.* Analyze the effect of phase locking of Josephson oscillations on the dc current flowing
through the junction, assuming that an external microwave source applies a fixed sinusoidal ac voltage,

$$V(t) = \bar V + A\cos\omega t,$$

to a junction with sinusoidal current-phase relation (55), using Eq. (54) for time evolution of phase $\varphi$.


2.7. Calculate the transmission coefficient T as a function of particle’s energy E for the
rectangular potential barrier,

$$U(x) = \begin{cases} 0, & \text{for } x < -d/2, \\ U_0, & \text{for } -d/2 < x < +d/2, \\ 0, & \text{for } d/2 < x, \end{cases}$$

for the case $E > U_0$. Analyze and interpret the result, taking into account that $U_0$ may be either positive
or negative. (In the last case, we are speaking about particle’s passage over a rectangular potential well
of finite depth.)


2.8. Looking at the lower (red) line in Fig. 1.7, it seems plausible that the 1D ground-state
function $X(x) \propto \sin(\pi x/a)$ of the simple quantum well (1.69) may be well approximated by an inverted
parabola:

$$X_{trial}(x) = Cx(a - x),$$

where C is the normalization constant, and $a \equiv a_x$ for brevity. Explore how good this approximation is. 97


2.9. Spell out the stationary wavefunctions of a harmonic oscillator in the WKB approximation,
and use them to calculate the expectation values $\langle x^2\rangle$ and $\langle x^4\rangle$ for arbitrary state number n.


2.10.* A 1D particle of mass m is placed into the following triangular quantum well: 98

$$U(x) = \begin{cases} +\infty, & \text{for } x < 0, \\ Fx, & \text{for } x > 0, \end{cases} \quad \text{with } F > 0.$$

(i) Calculate its energy spectrum using the WKB approximation.

(ii) Estimate the ground state energy using the variational method.

(iii) Calculate the three lowest energy levels, and also the 10th level, with at least 0.1%
accuracy, from the exact solution of the problem.

(iv) Compare and discuss the results.

Hints:


96 The constant P is called the force’s impulse. (In higher dimensionalities, it is a vector - just as the force is.) 

97 Solving this problem is a good preparation to the use of the full variational method in the next two problems 
(and beyond). 

98 With F = mg, this is just the well-known bouncing ball problem. 
- In Task (ii), try to incorporate a certain parameter $\lambda$ into your trial wavefunction, and then use
its adjustment to minimize the expectation value of system’s Hamiltonian (mentioned in Chapter 1):

$$\langle H\rangle_{trial} = \int_{-\infty}^{+\infty}\psi^*_{trial}\hat H\psi_{trial}\,dx,$$

where the trial function is assumed to be properly normalized. The variational method is based on the
easily provable 99 fact that this expectation value cannot be less than the genuine $E_g$, coinciding with it
only if the trial function exactly coincides with the genuine wavefunction $\psi_g$ of the ground state. Hence,
the lower $\langle H\rangle_{trial}$ you reach, the better is your result.

- The values of the first zeros of the Airy function, necessary for Task (iii), may be found in
many math handbooks, for example, in Table 10.13 of the collection edited by Abramowitz and Stegun
- see MA Sec. 16(i).

2.11. For a 1D particle of mass m placed into a potential well with the following profile,

$$U(x) = ax^{2s}, \quad \text{with } a > 0 \text{ and } s > 0,$$

(i) calculate its energy spectrum using the WKB approximation, and

(ii) estimate the ground state energy using the variational method.

Compare the ground state energy results for parameter s equal to 1, 2, 3, and 100.

2.12. Prove Eq. (117) for the case $T_{WKB} \ll 1$, using the connection formulas (104).


2.13. Use the WKB approximation to express the expectation value of the kinetic energy of a 1D
particle, confined in a soft potential well, in its n-th stationary state, via the derivative $dE_n/dn$, for $n \gg 1$.


2.14.* Use the WKB approximation to calculate the transparency T as a function of particle
energy E, for the following triangular potential barrier:

$$U(x) = \begin{cases} 0, & \text{for } x < 0, \\ U_0 - Fx, & \text{for } x > 0, \end{cases} \quad \text{with } F, U_0 > 0.$$

Hint: Be careful treating the sharp potential step at x = 0.


2.15 .* Prove that the symmetry of the scattering matrix elements describing an arbitrary time- 
independent scatterer allows its representation in the form (136a), with the additional restriction (136b). 


2.16. Prove the universal relations between elements of the transfer matrix T of a stationary (but
otherwise arbitrary) 1D scatterer, which were mentioned in Sec. 5.

2.17. For a deep and narrow 1D quantum well, modeled by a delta-function,

$$U(x) = -\mathcal{W}\delta(x), \quad \text{with } \mathcal{W} > 0, \qquad (*)$$


99 See, e.g., Sec. 8.2 below. 
find the localized eigenfunction(s) $\psi_n$ (with $\psi_n(x) \to 0$ at $x \to \pm\infty$), and the corresponding value(s) $E_n$.


2.18. A 1D particle was localized in the delta-functional well, with $U(x) = -\mathcal{W}\delta(x)$, such as the
one analyzed in the previous problem. Then (say, at t = 0) the well’s bottom is suddenly lifted, so that
the particle becomes free to move. Calculate the probability density w(k) to find the particle in a state
with wave number k at t > 0, and the final total energy of the system.


2.19. Calculate the lifetime of the metastable localized state of a 1D particle in the potential

$$U(x) = -\mathcal{W}\delta(x) - Fx, \quad \text{with } \mathcal{W} > 0,$$

using the WKB approximation. Formulate the condition of validity of the result.


2.20. Analyze the localized eigenfunction(s) and the characteristic equation(s) for eigenenergies
of a 1D particle in the following two-well potential

$$U(x) = -\mathcal{W}\left[\delta\!\left(x - \frac{a}{2}\right) + \delta\!\left(x + \frac{a}{2}\right)\right], \quad \text{with } \mathcal{W} > 0.$$

Explore asymptotic behaviors of the eigenenergies in the limits of very strong and very weak potential,
and find the number of localized states as a function of distance a.


2.21.* Consider a symmetric system of two quantum wells of the type shown in Fig. 23, but with
$U(0) = U(\pm\infty) = 0$ - see Fig. on the right. What is the sign of the well
interaction force due to a quantum particle of mass m, shared by
them, for the cases when the particle is in:

(i) a symmetric eigenstate, with $\psi_S(-x) = \psi_S(x)$?

(ii) an antisymmetric eigenstate, with $\psi_A(-x) = -\psi_A(x)$?

Use a different approach to confirm your result for the particular
case of delta-functional wells, considered in the previous problem.

[Figure: potential profile U(x) of the two-well system, as a function of x.]


2.22. Derive and analyze the characteristic equation for eigenvalues for a particle in a
rectangular well of a finite depth:

$$U(x) = \begin{cases} -U_0, & \text{for } |x| < a/2, \\ 0, & \text{otherwise.} \end{cases}$$

In particular, calculate the number of localized states as a function of well’s width a, and explore the
limit $U_0 \ll \hbar^2/2ma^2$.


2.23. Calculate energy E of the localized state in a quantum well of an arbitrary shape U(x),
provided that its width a is finite, and the average depth is very small:

$$|\bar U| \ll \frac{\hbar^2}{2ma^2}, \quad \text{where } \bar U \equiv \frac{1}{a}\int_{\text{well}}U(x)\,dx.$$
2.24.* A particle of mass m is moving in a field with the following potential:

$$U(x) = U_0(x) + \mathcal{W}\delta(x),$$

where $U_0(x)$ is a smooth, symmetric function with $U_0(0) = 0$, growing monotonically at $x \to \pm\infty$.

(i) Use the WKB approximation to derive the characteristic equation for the energy spectrum;

(ii) semi-quantitatively describe the spectrum structure evolution at the increase of $|\mathcal{W}|$, for both
signs of this parameter, and make the results more specific for the quadratic potential

$$U_0(x) = \frac{m}{2}\omega_0^2x^2.$$

2.25 . Prove Eq. (191), starting from Eq. (190). 

2.26. For the problem explored in the beginning of Sec. 7, i.e. 1D particle’s motion in a delta-
functional periodic potential shown in Fig. 24,

$$U(x) = \mathcal{W}\sum_{j=-\infty}^{+\infty}\delta(x - ja), \quad \text{with } \mathcal{W} > 0,$$

(where j are integers), write explicit expressions for its eigenfunctions:

(i) at the bottom, and

(ii) at the top

of the lowest energy band. Sketch both eigenfunctions.

2.27.* A 1D particle of mass m moves in an infinite periodic system of very narrow and deep
quantum wells that may be described by delta-functions:

$$U(x) = \mathcal{W}\sum_{j=-\infty}^{+\infty}\delta(x - ja), \quad \text{with } \mathcal{W} < 0.$$

(i) Sketch the energy band structure of the system for relatively small and relatively large values
of the quantum well’s “area” $|\mathcal{W}|$, and

(ii) calculate explicitly the ground state energy of the system in the limits of very small and very
large $|\mathcal{W}|$.

2.28 . * For the system discussed in the previous problem, write explicit expressions for the 
eigenfunctions of the system, corresponding to: 

(i) the bottom points of the lowest energy band, and 

(ii) the top points of that band, and 

(iii) the lowest points of each higher energy band, 

and sketch the functions. 
2.29.* The 1D “crystal”, analyzed in the last two problems, now extends only to x > 0,
while bordering a flat potential step at x = 0: 100

$$U(x) = \begin{cases} \mathcal{W}\displaystyle\sum_{j=1}^{+\infty}\delta(x - ja), \ \text{with } \mathcal{W} < 0, & \text{for } x > 0, \\ U_0 > 0, & \text{for } x < 0. \end{cases}$$

Prove that the system can have a set of so-called Tamm states, localized near the “surface” x = 0, and
calculate their energies in the limit when $U_0$ is very large but finite. (Quantify this condition.)

2.30. Calculate the whole transfer matrix of the rectangular tunnel barrier, specified by Eq. (76),
for particle energies both below and above $U_0$.

2.31. Use results of the previous problem to
calculate the transfer matrix of one period of the periodic
Kronig-Penney potential shown in Fig. 30b (reproduced in
Fig. on the right).


2.32 . Using results of the previous problem, derive the characteristic equations for particle’s 
motion in the periodic Kronig-Penney potential, for both E < Do and E > Do. Try to bring the equations 
to a form similar to that obtained in Sec. 5 for the delta-functional barriers - see Eq. (166). Use the 
equations to formulate the conditions of applicability of the tight-binding and weak-potential 
approximations, in terms of parameters Do, d, and a of the potential profile, and particle’s mass m and 
energy E. 

2.33.* For the Kronig-Penney potential, use the tight-binding approximation to calculate the
widths of the allowed energy bands. Compare the results with those of the previous problem (in the
corresponding limit).


2.34. For the same Kronig-Penney potential, use the weak-potential limit formulas to calculate
the energy gap widths. Again, compare the results with those of Problem 30, in the corresponding limit.

2.35. 1D periodic chains of atoms may exhibit the so-called Peierls instability,
leading to the Peierls transition to a phase in which atoms are slightly displaced by $\Delta x_j = (-1)^j \Delta x$, with $\Delta x \ll a$.
These displacements lead to the alternation of the coupling amplitudes $\delta_n$ (see Eq. (204)) between
some values $\delta_n^+$ and $\delta_n^-$. Use the tight-binding approximation to calculate the resulting change of the n-th
energy band, and discuss the result.


100 In applications to electrons in solid-state crystals, the delta-functional quantum wells model the attractive
potential of atomic nuclei, while $U_0$ represents the workfunction, i.e. the energy necessary for the extraction of an
electron from the crystal to the free space - see, e.g., EM Sec. 2.6 and SM Sec. 6.4.
2.36. Assuming the quantum effects to be small, calculate the lower part of
the energy spectrum of the following system: a small bead of mass m, free to move
without friction along a ring of radius R that is rotated about its vertical diameter
with a constant angular velocity ω - see the figure on the right. 101 Formulate a
quantitative condition of validity of your results.



2.37. A 1D harmonic oscillator (with mass m and frequency $\omega_0$) had been in its ground state;
then an additional force F was suddenly applied (and retained constant in time). Find the probability of
the oscillator staying in its ground state.


2.38. A 1D particle of mass m has been placed into a quadratic potential well (111),

$$U(x) = \frac{m\omega_0^2 x^2}{2},$$

and allowed to relax into the ground state. At t = 0, the
well starts to be moved with velocity v, without changing its profile, so that at t > 0 the above formula
for U is valid with the replacement $x \to x' \equiv x - vt$. Calculate the probability for the system to still be in
the ground state at t > 0.


2.39. A 1D particle is placed into the following potential well:

$$U(x) = \begin{cases} +\infty, & \text{for } x < 0, \\ m\omega_0^2 x^2/2, & \text{for } x > 0. \end{cases}$$

(i) Find its eigenstates and eigenenergies.

(ii) This system had been allowed to relax into its ground state, and then the potential wall at x < 0
was rapidly removed, so that the system was instantly turned into the usual harmonic oscillator (with the
same m and $\omega_0$). Find the probability for the oscillator to be in its ground state.


2.40. Prove the following formula for the propagator of the 1D harmonic oscillator:

$$G(x,t;x_0,t_0) = \left(\frac{m\omega_0}{2\pi i\hbar\sin[\omega_0(t-t_0)]}\right)^{1/2} \exp\left\{\frac{im\omega_0}{2\hbar\sin[\omega_0(t-t_0)]}\left[(x^2+x_0^2)\cos[\omega_0(t-t_0)] - 2xx_0\right]\right\}.$$

Discuss the relation between this formula and the propagator of a free 1D particle.


2.41. Use the variational method to estimate the ground state energy $E_g$ of the following confined
1D systems:

(i) a harmonic oscillator, with $U(x) = m\omega_0^2 x^2/2$, and

(ii) a particle in the following potential well: $U(x) = -U_0\exp\{-\alpha x^2\}$, with $\alpha > 0$ and $U_0 > 0$.


101 This system was used as the analytical mechanics “testbed problem” in the CM part of this series, and the 
reader is welcome to use any relations derived there - but remember that they pertain to the classical mechanics 
domain! 
In the latter case, get explicit results in the limits of small and large $U_0$, and give their interpretation.

2.42.* Use the WKB approximation to calculate the lifetime of the metastable ground state of a
1D particle of mass m in the “pocket” of the potential profile

$$U(x) = \frac{m\omega_0^2}{2}x^2 - \alpha x^3.$$

Contemplate the significance of this problem.


2.43. In the context of the Sturm oscillation theorem mentioned in Sec. 10, prove that the number
of zeros of stationary wavefunctions of a particle, confined in an arbitrary potential well, always 
increases with energy. 

Hint : You may like to use the suitably modified Eq. (189). 
Chapter 3. Higher Dimensionality Effects

The coverage of multi-dimensional problems of wave mechanics in this course is minimal: it is limited
to a few phenomena (such as the AB effect and Landau levels) that cannot take place in one dimension
due to topological reasons, and a few key 3D problems (such as the Born approximation in scattering
theory and the Bohr atom) whose solutions are necessary for numerous applications.


3.1. Quantum interference and the AB effect 

In the past two chapters, we have already discussed some effects of the de Broglie wave
interference. For example, standing waves inside a quantum well, or even on the top of a tunnel barrier,
may be considered as a result of interference of the incident and reflected waves. However, there are some
remarkable new effects made possible by the spatial separation of such traveling waves, and such separation
requires a higher (either 2D or 3D) dimensionality. A good example of such separation is provided by
the Young-type experiment (Fig. 1) in which particles are passed through two narrow holes (or slits) in
an otherwise opaque partition.



Fig. 3.1. Scheme of the “two-slit” (Young-type) interference experiment. (Label in the figure: counting
rate W ∝ w(r) with 2 slits.)


If the particles emitted by the source do not interact (which is always true if the emission rate is
sufficiently low), the average rate of particle counting by the detector is proportional to the probability
density $w(\mathbf{r}, t) = \Psi(\mathbf{r}, t)\Psi^*(\mathbf{r}, t)$ to find a single particle at the detector’s location r, where $\Psi(\mathbf{r}, t)$ is the
solution of the single-particle Schrödinger equation (1.25). Let us describe this experiment for the case
when the particles may be represented by monochromatic waves of energy E (e.g., very long wave
packets), so that the wavefunction may be taken in the form given by Eqs. (1.56) and (1.61):
$\Psi(\mathbf{r}, t) = \psi(\mathbf{r})\exp\{-iEt/\hbar\}$. In this case, in the free-space parts of the system, $\psi(\mathbf{r})$ satisfies the stationary
Schrödinger equation (1.60) with Hamiltonian (1.27a):

$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi. \tag{3.1a}$$

With the standard definition $k \equiv (2mE)^{1/2}/\hbar$, it may be rewritten as the 3D Helmholtz equation

$$\nabla^2\psi + k^2\psi = 0 \tag{3.1b}$$
- an evident 3D generalization of Eqs. (1.75) or (2.81).

The opaque parts of the partition may be well described as classically forbidden regions, so if
their size scale a is much larger than the wavefunction penetration depth δ (2.67), we can use on their
surface S the same boundary conditions as for the quantum barrier of infinite height:

$$\psi|_S = 0. \tag{3.2}$$


Equations (1) and (2) formulate the standard boundary problem of the theory of propagation of
scalar waves of any nature. For an arbitrary geometry, such a problem does not have a simple analytical
solution. However, for a conceptual discussion of interference we may use certain natural assumptions that
will allow us to find its particular, approximate solution.


First, let us discuss wave emission, into free space, by a small-size source located at the origin.
Naturally, the emitted wave should be spherically-symmetric: $\psi(\mathbf{r}) = \psi(r)$. Using the well-known
expression for the Laplace operator in spherical coordinates, 1 we then reduce Eq. (1) to an ordinary
differential equation

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\psi}{dr}\right) + k^2\psi = 0. \tag{3.3}$$


Let us introduce a new function, $f(r) \equiv r\psi(r)$. Plugging the reciprocal relation $\psi = f/r$ into Eq. (3), we see
that it is reduced to the 1D wave equation,

$$\frac{d^2 f}{dr^2} + k^2 f = 0, \tag{3.4}$$

whose solutions were discussed in detail in Sec. 2.2. For a fixed k, the general solution of Eq. (4) is

$$f = f_+ e^{ikr} + f_- e^{-ikr}, \tag{3.5}$$

so that the full wavefunction

$$\psi(r) = \frac{f_+}{r}e^{ikr} + \frac{f_-}{r}e^{-ikr}, \quad\text{i.e.}\quad \Psi(r,t) = \frac{f_+}{r}e^{i(kr-\omega t)} + \frac{f_-}{r}e^{-i(kr+\omega t)}, \quad\text{with } \omega \equiv \frac{E}{\hbar} = \frac{\hbar k^2}{2m}. \tag{3.6}$$

If the source is located at point $\mathbf{r}' \neq 0$, the obvious generalization of Eq. (6) is

$$\Psi(\mathbf{r},t) = \frac{f_+}{R}e^{i(kR-\omega t)} + \frac{f_-}{R}e^{-i(kR+\omega t)}, \quad\text{with } R \equiv |\mathbf{R}|, \quad \mathbf{R} \equiv \mathbf{r} - \mathbf{r}'. \tag{3.7}$$

The first term of this solution describes a spherically-symmetric wave propagating from the
source outward, while the second one, a wave converging onto the source point r’ from large distances.
Though the latter solution is possible in some very special circumstances (say, when the outgoing wave
is reflected back from a spherical shell), for our problem, only the outgoing waves are relevant, so that
we may keep only the first term (proportional to $f_+$) in Eq. (7). Note that the factor R in the denominator
(that was absent in the 1D geometry) has a simple physical sense: it provides the independence of the full
probability current $I = 4\pi R^2 j(R)$, with $j(R) \propto k\Psi\Psi^* \propto 1/R^2$, of the distance R between the observation
point and the source.
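
As a quick numerical sanity check (a sketch, not part of the derivation above), the following Python fragment verifies by finite differences that the outgoing wave $f_+ e^{ikr}/r$ satisfies the radial equation (3), and that the total current $4\pi R^2 j(R)$ does not depend on R. The function names, the step h, and the sample values of k and R are illustrative assumptions; constant prefactors in j are dropped.

```python
import cmath
import math

def psi(r, k, f_plus=1.0):
    """Outgoing spherical wave f_+ exp(ikr)/r, the first term of Eq. (6)."""
    return f_plus * cmath.exp(1j * k * r) / r

def helmholtz_residual(r, k, h=1e-4):
    """Finite-difference estimate of (1/r^2) d/dr (r^2 dpsi/dr) + k^2 psi."""
    def flux(x):
        # r^2 * dpsi/dr, with a central difference for the derivative
        return x**2 * (psi(x + h, k) - psi(x - h, k)) / (2 * h)
    radial_laplacian = (flux(r + h) - flux(r - h)) / (2 * h) / r**2
    return radial_laplacian + k**2 * psi(r, k)

def total_current(R, k):
    """I = 4 pi R^2 j(R), with j taken as k |psi|^2 (prefactors dropped)."""
    return 4 * math.pi * R**2 * k * abs(psi(R, k))**2

k = 2.0  # assumed wave number, arbitrary units
residuals = [abs(helmholtz_residual(r, k)) for r in (0.5, 1.0, 3.0)]
currents = [total_current(R, k) for R in (1.0, 10.0, 100.0)]
print(residuals)  # all small compared with k^2 |psi|
print(currents)   # the same value at every R
```

The residuals are limited only by the finite-difference step, while the current is the same at all three radii, which is exactly the role of the 1/R factor noted above.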


1 See, e.g., MA Eq. (10.9). 


Now let us assume that the partition’s geometry is not too complicated - for example, it is planar
as shown in Fig. 1 - and consider the region of the particle detector location far behind the partition (at
$z \gg 1/k$), and at a relatively small angle to it: $|x| \ll z$. Then it should be physically clear that the
spherical waves (7) emitted by each point inside the slit cannot be perturbed too much by the opaque
parts of the partition, and their only role is the restriction of the set of such emitting points by the area of
the slits. Hence, an approximate solution of the boundary problem is given by the following Huygens
principle: the wave behind the partition looks as if it was the sum of contributions (7) of point sources
located in the slits, with each source’s strength $f_+$ proportional to the amplitude of the wave arriving at
this pseudo-source from the real source - see Fig. 1. This principle finds its confirmation in strict wave
theory, which shows 2 that with our assumptions, the solution of the boundary problem (1)-(2) may be
presented as the following Kirchhoff integral:

$$\psi(\mathbf{r}) = c\int_{\text{slits}} \psi(\mathbf{r}')\frac{e^{ikR}}{R}\,d^2r', \quad\text{with } c = \frac{k}{2\pi i}. \tag{3.8}$$


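If the source is also far from the partition, its wave front is almost parallel to the slit plane, and
the slits are not too broad, we can take $\psi(\mathbf{r}')$ constant ($\psi_{1,2}$) at each slit, so that Eq. (8) is reduced to

$$\psi(\mathbf{r}) = a''_1\exp\{ikl''_1\} + a''_2\exp\{ikl''_2\}, \quad\text{with } a''_{1,2} = \frac{cA_{1,2}}{l''_{1,2}}\psi_{1,2}, \tag{3.9}$$

where $A_{1,2}$ are the slit areas, and $l''_{1,2}$ are the distances from the slits to the observation point. The
wavefunctions on the slits may be calculated approximately 3 by applying the
same Eq. (7) to the space before the slits: $\psi_{1,2} \approx (f_+/l'_{1,2})\exp\{ikl'_{1,2}\}$, where $l'_{1,2}$ are the distances from the
source to the slits. As a result, Eq. (9) may be rewritten as

$$\psi(\mathbf{r}) = a_1\exp\{ikl_1\} + a_2\exp\{ikl_2\}, \quad\text{with } l_{1,2} \equiv l'_{1,2} + l''_{1,2}, \quad a_{1,2} \equiv \frac{cf_+A_{1,2}}{l'_{1,2}\,l''_{1,2}}. \tag{3.10}$$

(As Fig. 1 shows, each of $l_{1,2}$ is the length of the full classical path of the particle from the source,
through the corresponding slit, and further to the observation point r.)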


According to Eq. (10), the resulting rate of particle counting is proportional to

$$w(\mathbf{r}) = \psi(\mathbf{r})\psi^*(\mathbf{r}) = |a_1|^2 + |a_2|^2 + 2|a_1a_2|\cos\varphi_{12}, \tag{3.11}$$

where

$$\varphi_{12} \equiv k(l_2 - l_1) \tag{3.12}$$

is the difference of the total wave phase accumulations along each of the two alternative trajectories. The
last expression may be evidently generalized as


2 For a proof of Eq. (8), see, e.g., EM Sec. 8.5. 

3 A possible (and reasonable) concern about the application of Eq. (7) to the field in the slits is that it ignores the
effect of opaque parts of the partition. However, as we know from Chapter 2, the main role of the classically
forbidden region is providing the reflection of the incident wave towards its source (i.e. to the left in Fig. 1). As a
result, the contribution of this reflection to the field inside the slits is insignificant if $A_{1,2} \gg \lambda^2$, and even in the
opposite case provides just some rescaling of the amplitudes $a_{1,2}$, which is unimportant for our conceptual
discussion.
$$\varphi_{12} = \int_C \mathbf{k}\cdot d\mathbf{r}, \tag{3.13}$$

with integration along the virtually closed contour C (see the dashed line in Fig. 1), i.e. from point 1, in
the positive (i.e. counterclockwise) direction to point 2. (From our experience with the 1D WKB
approximation we may expect such generalization to be valid even if k changes, sufficiently slowly,
along the paths.)

Our result (11) shows that the counting rate oscillates as a function of the difference $(l_2 - l_1)$ that
in turn changes with the detector’s position, giving the famous interference pattern, with the amplitude
proportional to the product $|a_1a_2|$, and hence vanishing if either of the slits is closed. For a wave theory,
this is a well-known result, 4 but for particle physics, it was (and still is :-) rather shocking. Indeed, our
analysis pertains to a very low particle emission/detection rate, so that there is no other way to interpret
it than as resulting from the particle’s interference with itself, or rather the interference of its
wavefunction parts passing through each of two slits.
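
To make the fringe pattern concrete, here is a minimal Python sketch of Eqs. (11)-(12). It assumes a source on the symmetry axis (so that only the slit-to-detector path lengths differ), and the wavelength, slit separation, and distance used below are illustrative values, not taken from the text.

```python
import math

def counting_rate(x, z, slit_sep, k, a1=1.0, a2=1.0):
    """w from Eq. (11), with phi_12 = k (l2 - l1) from Eq. (12).

    Assumption: the source is equidistant from both slits, so only the
    slit-to-detector distances contribute to the path difference.
    """
    l1 = math.hypot(x - slit_sep / 2, z)   # slit 1 at x = +slit_sep/2
    l2 = math.hypot(x + slit_sep / 2, z)   # slit 2 at x = -slit_sep/2
    phi12 = k * (l2 - l1)
    return abs(a1)**2 + abs(a2)**2 + 2 * abs(a1 * a2) * math.cos(phi12)

wavelength = 1.0e-10            # assumed de Broglie wavelength, m
k = 2 * math.pi / wavelength
slit_sep, z = 1.0e-6, 1.0       # assumed slit separation and distance, m

fringe_period = wavelength * z / slit_sep        # small-angle estimate
w_center = counting_rate(0.0, z, slit_sep, k)    # constructive interference
w_half = counting_rate(fringe_period / 2, z, slit_sep, k)  # destructive
w_one_slit = counting_rate(0.0, z, slit_sep, k, a2=0.0)    # one slit closed
print(w_center, w_half, w_one_slit)
```

With equal amplitudes the rate swings between four times the single-slit value and nearly zero, and the oscillating term disappears altogether when one amplitude is set to zero.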

Let us now discuss a very interesting effect of magnetic field on the quantum interference. In
order to make the discussion simpler, let us consider an alternative version of the two-slit experiment, in
which each of the alternative paths is confined to a narrow channel using a partial confinement - see Fig. 2. (In
this arrangement, moving the particle detector without changing the channels’ geometry, and hence local
values of k, may be more problematic in experimental practice, so let us think about its position r as fixed.)



In this case, because of the effect of the walls providing the path confinement, we cannot use
expressions (10) for amplitudes $a_{1,2}$. However, from the discussions in Sec. 1.6 and Sec. 2.2, it should
be clear that the first of expressions (10) remains valid, though maybe with a value of k specific for
each channel.

The benefit of this geometry is that we can now apply a magnetic field $\mathscr{B}$, perpendicular to the
plane of particle motion, that would pierce contour C, but would not touch the particle propagation
channels. In classical physics, the magnetic field’s effect on a particle with electric charge q is described by
the Lorentz force 5

$$\mathbf{F}_{\mathscr{B}} = q\mathbf{v}\times\boldsymbol{\mathscr{B}}, \tag{3.14}$$


4 See, e.g., a detailed discussion in EM Sec. 8.4. 

5 See, e.g., Sec. 5.1. Note that Eq. (14), as well as all other formulas of this course, are in the SI units; in Gaussian
units, all terms which include either $\mathscr{B}$ or A should be divided by c, the speed of light in free space.
where $\mathscr{B}$ is the field value at the point of the particle’s location, so that for the experiment shown in Fig.
2, $\mathbf{F}_{\mathscr{B}} = 0$, and the field would not affect the particle motion at all. In quantum mechanics, this is not so,
and the field does affect the probability density w, even if $\mathscr{B} = 0$ in all points where the wavefunction
$\psi(\mathbf{r})$ is not equal to zero.

In order to describe this surprising effect, let us first develop a general framework for the account of
effects of electromagnetic fields on a quantum particle, which will also give us some important
by-product results. In order to do that, we need to calculate the Hamiltonian operator of a charged particle
in the field. For an electrostatic field, this hardly presents any problem. Indeed, from classical
electrodynamics we know that such a field may be presented as a gradient of its electrostatic potential $\phi$,

$$\boldsymbol{\mathscr{E}} = -\nabla\phi(\mathbf{r}), \tag{3.15}$$

so that the force exerted by the field on a particle with electric charge q,

$$\mathbf{F}_{\mathscr{E}} = q\boldsymbol{\mathscr{E}}, \tag{3.16}$$

may be described by adding the potential energy of the field,

$$U(\mathbf{r}) = q\phi(\mathbf{r}), \tag{3.17}$$

to other (possible) components of the full potential energy of the particle. As we have already discussed,
such a function of coordinates may be included in the Hamiltonian operator just by adding it to the
kinetic energy operator (1.27).

However, the magnetic field’s effect is peculiar: since its Lorentz force (14) cannot do any work on
the particle:

$$dW_{\mathscr{B}} \equiv \mathbf{F}_{\mathscr{B}}\cdot d\mathbf{r} = \mathbf{F}_{\mathscr{B}}\cdot\mathbf{v}\,dt = q(\mathbf{v}\times\boldsymbol{\mathscr{B}})\cdot\mathbf{v}\,dt = 0, \tag{3.18}$$

the field cannot be presented by any potential energy, so it may not be immediately clear how to account
for it in the Hamiltonian. Help comes from the analytical-mechanics approach to classical
electrodynamics: 6 in the nonrelativistic limit, the Hamiltonian function of a particle in electromagnetic
field looks superficially like that in electrostatic field only:

$$H = \frac{mv^2}{2} + U = \frac{p^2}{2m} + q\phi; \tag{3.19}$$

however, the momentum $\mathbf{p} \equiv m\mathbf{v}$ that participates in this expression is now the difference

$$\mathbf{p} = \mathbf{P} - q\mathbf{A}. \tag{3.20}$$

Here A is the vector-potential, defined by the well-known relations for the electric and magnetic field: 7

$$\boldsymbol{\mathscr{E}} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}, \qquad \boldsymbol{\mathscr{B}} = \nabla\times\mathbf{A}, \tag{3.21}$$

while P is the canonical momentum whose Cartesian components may be calculated (in classics) from 
the Lagrangian function, 8 using the standard formula of analytical mechanics, 


6 See, e.g., EM Sec. 9.7. 

7 See, e.g., EM Sec. 6.7, in particular Eqs. (6.106). 

8 Just for reader’s reference, the classical Lagrangian corresponding to Hamiltonian (19) is 
$$P_j = \frac{\partial L}{\partial v_j}. \tag{3.22}$$

To emphasize the difference between the two momenta, $\mathbf{p} = m\mathbf{v}$ is frequently called the
kinematic momentum (or “mv-momentum”). The distinction between p and P = p + qA becomes even
clearer if we notice that the vector-potential is not gauge-invariant: according to the second of Eqs. (21),
at the so-called gauge transformation

$$\mathbf{A} \to \mathbf{A} + \nabla\chi, \tag{3.23}$$

with an arbitrary single-valued scalar gauge function $\chi = \chi(\mathbf{r}, t)$, the magnetic field does not change.
Moreover, according to the first of Eqs. (21), if we make the simultaneous replacement

$$\phi \to \phi - \frac{\partial\chi}{\partial t}, \tag{3.24}$$

the gauge transformation does not affect the electric field either. With that, the gauge function does not
change the classical particle’s equation of motion, and hence the velocity v and momentum p. Hence,
the kinematic momentum is gauge-invariant, while P is not, because it changes by $q\nabla\chi$.


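Now the standard way of transfer to quantum mechanics is to treat the canonical rather than
kinematic momentum according to the correspondence postulate discussed in Sec. 1.2. This means that in
the coordinate representation, the operator of this variable is given by Eq. (1.26): 9

$$\hat{\mathbf{P}} = -i\hbar\nabla. \tag{3.25}$$

Hence the Hamiltonian operator corresponding to the classical function (19) is

$$\hat{H} = \frac{1}{2m}\left(-i\hbar\nabla - q\mathbf{A}\right)^2 + q\phi \equiv -\frac{\hbar^2}{2m}\left(\nabla - i\frac{q}{\hbar}\mathbf{A}\right)^2 + q\phi, \tag{3.26}$$

so that the Schrödinger equation of a particle moving in electromagnetic field (but otherwise free) is

$$i\hbar\frac{\partial\Psi}{\partial t} = -\frac{\hbar^2}{2m}\left(\nabla - i\frac{q}{\hbar}\mathbf{A}\right)^2\Psi + q\phi\Psi. \tag{3.27}$$

We may now repeat all the calculations of Sec. 1.4 for the case $\mathbf{A} \neq 0$, and readily get the
following generalized expression for the probability current density:

$$\mathbf{j} = \frac{\hbar}{2im}\left[\Psi^*\left(\nabla - i\frac{q}{\hbar}\mathbf{A}\right)\Psi - \text{c.c.}\right] = \frac{\hbar}{m}|\Psi|^2\left(\nabla\varphi - \frac{q}{\hbar}\mathbf{A}\right). \tag{3.28}$$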


$$L = \frac{mv^2}{2} + q\mathbf{v}\cdot\mathbf{A} - q\phi$$

- see EM Sec. 9.7. Note that this function includes A within a term that cannot be interpreted as either the purely
kinetic energy (as the first term) or the purely potential energy (as the last term with the minus sign).

9 The validity of this choice is clear from the fact that if the kinematic momentum was described by this differential
operator, the Hamiltonian operator corresponding to the classical Hamiltonian function (19) would not include the
magnetic field at all, and hence solutions of the corresponding Schrödinger equation could not satisfy the
correspondence principle.



We see that the current density is gauge-invariant (as required for any observable) only if the
wavefunction’s phase $\varphi$ changes as

$$\varphi \to \varphi + \frac{q}{\hbar}\chi. \tag{3.29}$$

This may be a point of concern: since the quantum interference is described by the spatial dependence of
phase $\varphi$, can the observed interference pattern depend on the gauge function choice (which would not
make sense)? Fortunately, this is not true, because the spatial phase difference between two interfering
paths, participating in Eq. (11), is gauge-transformed as

$$\varphi_{12} \to \varphi_{12} + \frac{q}{\hbar}(\chi_2 - \chi_1). \tag{3.30}$$

But $\chi$ has to be a single-valued function of coordinates, hence in the limit when points 1 and 2 coincide,
$\chi_1 = \chi_2$, so that $\varphi_{12}$ (and hence the interference pattern) is gauge-invariant.

However, the difference $\varphi_{12}$ may be affected by the magnetic field, even if it is localized outside
the channels in which the particle propagates. Indeed, in this case the field cannot affect the particle’s
velocity and current density j:

$$\mathbf{j}(\mathbf{r})|_{\mathscr{B}\neq 0} = \mathbf{j}(\mathbf{r})|_{\mathscr{B}=0}, \tag{3.31}$$

so that the last form of Eq. (28) yields

$$\nabla\varphi(\mathbf{r})|_{\mathscr{B}\neq 0} = \nabla\varphi(\mathbf{r})|_{\mathscr{B}=0} + \frac{q}{\hbar}\mathbf{A}. \tag{3.32}$$

Integrating this equation along contour C (Fig. 2), for the phase difference between points 1 and 2 we
get

$$\varphi_{12}|_{\mathscr{B}\neq 0} = \varphi_{12}|_{\mathscr{B}=0} + \frac{q}{\hbar}\int_C \mathbf{A}\cdot d\mathbf{r}, \tag{3.33}$$

where the integral should be taken along the same virtually closed contour C as before (in Fig. 2, from
point 1, counterclockwise along the dashed line to point 2). But from the classical electrodynamics we
know 10 that as points 1 and 2 are overlapped, i.e. contour C becomes closed, the last integral is just the
magnetic flux $\Phi \equiv \int \mathscr{B}_n\,d^2r$ through any smooth surface limited by contour C, so that Eq. (33) may be
presented as

$$\varphi_{12}|_{\mathscr{B}\neq 0} = \varphi_{12}|_{\mathscr{B}=0} + \frac{q}{\hbar}\Phi. \tag{3.34a}$$

In terms of the interference pattern, this means a shift of interference fringes, proportional to the
magnetic flux (Fig. 3). This phenomenon is usually called the “Aharonov-Bohm” (or just the AB)
effect. 11 For particles with a single elementary charge, q = ±e, this result is frequently presented as


10 See, e.g., EM Sec. 5.3. 

11 I personally prefer the latter, less personable name, because the effect had been actually predicted by W.
Ehrenberg and R. Siday in 1949, before it was rediscovered by Y. Aharonov and D. Bohm in 1959. To be fair to
Aharonov and Bohm, it was their work that triggered a wave of interest in the phenomenon, resulting in its first
$$\varphi_{12}|_{\mathscr{B}\neq 0} = \varphi_{12}|_{\mathscr{B}=0} \pm 2\pi\frac{\Phi}{\Phi_0'}, \tag{3.34b}$$

where the fundamental constant $\Phi_0' \equiv 2\pi\hbar/e = h/e \approx 4.14\times10^{-15}$ Wb has the meaning of the flux necessary to
change $\varphi_{12}$ by $2\pi$, i.e. shift the interference pattern (11) by one period, and is called the normal magnetic
flux quantum, because of the reasons we will soon discuss.
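
The gauge independence of the closed-contour integral and the size of the resulting phase shift can be illustrated with a short Python sketch. It assumes an idealized, infinitely thin solenoid on the z-axis (for which the vector-potential outside is purely azimuthal), a circular contour, and an arbitrarily chosen single-valued gauge function; all function names and numerical values are hypothetical.

```python
import math

HBAR = 1.054571817e-34       # J*s
E_CHARGE = 1.602176634e-19   # C

def closed_line_integral(A, radius, n_steps=2000):
    """Midpoint-rule estimate of the integral of A.dr over a circle."""
    total = 0.0
    d_theta = 2 * math.pi / n_steps
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        ax, ay = A(x, y)
        # dr = radius * (-sin, cos) * d_theta, counterclockwise direction
        total += (-ax * math.sin(theta) + ay * math.cos(theta)) * radius * d_theta
    return total

def solenoid_potential(flux):
    """A outside an (assumed) thin solenoid: flux / (2 pi rho), azimuthal."""
    def A(x, y):
        c = flux / (2 * math.pi * (x * x + y * y))
        return (-c * y, c * x)
    return A

def gauge_transformed(A, grad_chi):
    """A + grad(chi), Eq. (23), for a single-valued gauge function chi."""
    def A_new(x, y):
        ax, ay = A(x, y)
        gx, gy = grad_chi(x, y)
        return (ax + gx, ay + gy)
    return A_new

flux_quantum_normal = 2 * math.pi * HBAR / E_CHARGE   # Phi_0' = h/e
flux = 0.5 * flux_quantum_normal                      # assumed enclosed flux

A = solenoid_potential(flux)
# chi = flux * x^2 * y (arbitrary choice), so grad chi = flux * (2xy, x^2)
A_gauged = gauge_transformed(A, lambda x, y: (flux * 2 * x * y, flux * x * x))

integral = closed_line_integral(A, radius=1.0)
integral_gauged = closed_line_integral(A_gauged, radius=1.0)
phase_shift = E_CHARGE * integral / HBAR              # Eq. (34a) with q = e
print(flux_quantum_normal, integral / flux, phase_shift)
```

Both contour integrals reproduce the enclosed flux, and half of a normal flux quantum shifts the phase difference by π, in agreement with Eq. (34b).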



Fig. 3.3. Typical results of a two-paths interference experiment by A. Tonomura et al., Phys. Rev.
Lett. 56, 792 (1986), showing the AB effect for electrons well shielded from the applied magnetic
field. In this particular experimental geometry, the AB effect produces a relative shift of the
interference patterns inside and outside the dark ring. (a) $\Phi = \Phi_0'/2$, (b) $\Phi = \Phi_0'$. © 1986 APS.


The AB effect may be “almost explained” classically, in terms of Faraday’s electromagnetic
induction. Indeed, a change $\Delta\Phi$ of magnetic flux in time causes a vortex-like electric field around it.
That field is not restricted to the magnetic field’s location, i.e. may reach the particle’s trajectories. The
field’s magnitude (or rather that of its integral along contour C) may be readily calculated by integration of
the first of Eqs. (21):

$$\Delta V \equiv \oint_C \boldsymbol{\mathscr{E}}\cdot d\mathbf{r} = -\frac{d\Phi}{dt}. \tag{3.35}$$

I hope that in this expression the reader readily recognizes the integral (“undergraduate”) form of
Faraday’s induction law. Now let us assume that the variable separation described in Sec. 1.5 may be
applied to the end points 1 and 2 of the particle’s alternative trajectories as two independent systems, 12 and
that the magnetic flux’ change by a certain amount $\Delta\Phi$ does not change the spatial parts $\psi_j$ of
wavefunctions of these systems. Then change (35) leads to the change of the potential energy difference
$\Delta U = q\Delta V$ between the two points, and repeating the arguments that were used in Sec. 2.3 at the discussion
of the Josephson effect, we may rewrite Eq. (2.53) as

$$\frac{d\varphi_{12}}{dt} = -\frac{\Delta U}{\hbar} = -\frac{q}{\hbar}\Delta V = \frac{q}{\hbar}\frac{d\Phi}{dt}. \tag{3.36}$$

Integrating this relation over the time of magnetic field’s change, we get 


experimental observation by R. Chambers in 1960 and several other groups soon after that. Later, the experiments
were improved, using ferromagnetic cores and/or superconducting shielding to provide better separation between
the electron trajectories and the applied magnetic field - see, e.g., the work whose results are shown in Fig. 3.

12 This assumption may seem a bit of a stretch, but the resulting relation (37) may be indeed proven for a rather
realistic model, though that would take more time and space than I can afford.
$$\Delta\varphi_{12} = \frac{q}{\hbar}\Delta\Phi \tag{3.37}$$

- superficially, the same result as given by Eq. (34).

However, this interpretation of the AB effect is limited. Indeed, it requires the particle to be in
the system (on the way from the source to the detector) during the flux change, i.e. when the induced
electric field $\mathscr{E}$ may affect its dynamics. On the contrary, Eq. (34) predicts that the interference pattern
would shift even if the field change has been made when there is no particle in the system, and hence
the induced field $\mathscr{E}$ could not be felt by it. Experiment confirms the latter conclusion. Hence, there is something in
the space where a particle propagates (i.e., outside of the magnetic field region), which transfers
information about even the static magnetic field to the particle. The standard interpretation of this
surprising fact is as follows: the vector-potential A is not just a convenient mathematical tool, but a
physical reality (just as its electric counterpart $\phi$), despite the large freedom of choice we have in
prescribing specific spatial and temporal dependences of these potentials without affecting any
observable - see Eqs. (23)-(24).


Let me briefly discuss the very interesting form the AB effect takes in superconductivity. In this
case, our results require two changes. The first one is simple: since superconductivity may be interpreted
as the Bose-Einstein condensate of Cooper pairs with electric charge q = 2e, $\Phi_0'$ has to be replaced by
the so-called superconducting flux quantum 13

$$\Phi_0 \equiv \frac{\pi\hbar}{e} \approx 2.07\times10^{-15}\ \text{Wb} = 2.07\times10^{-7}\ \text{Gs}\cdot\text{cm}^2. \tag{3.38}$$


Second, since the pairs are Bose particles and are all condensed in the same quantum state,
described by the same wavefunction, the total electric current density, proportional to the probability
current density j, may be extremely large - in real superconducting materials, up to $\sim 10^{12}$ A/m$^2$. In these
conditions, one cannot neglect the contribution of that current into the magnetic field and hence its flux
$\Phi$, which (according to the Lenz rule of the Faraday induction law) tries to compensate changes in
external flux. In order to see possible results of this contribution, let us consider a closed
superconducting loop (Fig. 4).



Fig. 3.4. Flux quantization in a superconducting 
loop. 


Due to the Meissner effect (which is just another version of the flux self-compensation), current 
and magnetic field penetrate inside the superconductor by only a small distance (called the London 


13 One more bad, though common, term - a wire can (super)conduct, but a quantum hardly can! 
penetration depth) $\delta_L \sim 10^{-7}$ m. 14 If the loop is made of a superconducting wire that is considerably
thicker than $\delta_L$, we can draw a contour deep inside the wire, at which the current density is negligible.
According to Eq. (28), everywhere at the contour,

$$\nabla\varphi - \frac{q}{\hbar}\mathbf{A} = 0. \tag{3.39}$$

Integrating this equation along the contour as before (from point 1 to the virtually coinciding point 2),
we need to have the phase difference $\varphi_{12} = 2\pi n$, because the wavefunction $\psi \propto \exp\{i\varphi\}$ in the initial
and final points 1 and 2 should be “essentially” the same, i.e. produce the same observables. As a result,
we get

$$\Phi \equiv \oint_C \mathbf{A}\cdot d\mathbf{r} = \frac{2\pi\hbar}{q}n = \frac{\pi\hbar}{e}n \equiv n\Phi_0. \tag{3.40}$$


This is the famous flux quantization effect, 15 which justifies the term “magnetic flux quantum” for the
constant $\Phi_0$ given by Eq. (38).
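
For a feeling of the scales involved, the following Python sketch evaluates both flux quanta and the first few flux values allowed by Eq. (40). The loop diameter of 10 microns is an arbitrary illustrative assumption, as are the function names.

```python
import math

HBAR = 1.054571817e-34       # J*s
E_CHARGE = 1.602176634e-19   # C
WB_TO_GS_CM2 = 1.0e8         # 1 Wb = 10^8 Gs*cm^2

def flux_quantum(q):
    """2 pi hbar / |q|: the flux that changes the phase q*Phi/hbar by 2 pi."""
    return 2 * math.pi * HBAR / abs(q)

phi0_normal = flux_quantum(E_CHARGE)       # single charges, Eq. (34b)
phi0_super = flux_quantum(2 * E_CHARGE)    # Cooper pairs, Eq. (38)

def allowed_fluxes(n_max):
    """Fluxes n * Phi_0 permitted by Eq. (40) for n = 0 .. n_max."""
    return [n * phi0_super for n in range(n_max + 1)]

# Field increment per one flux quantum, for an assumed 10-micron-wide loop
loop_area = math.pi * (5.0e-6) ** 2        # m^2
field_step = phi0_super / loop_area        # T

print(phi0_super, phi0_super * WB_TO_GS_CM2)
print(allowed_fluxes(3))
print(field_step)
```

For such a loop, one flux quantum corresponds to a field change of a few tens of microtesla, comparable to the Earth's magnetic field.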


Here I have to mention in passing the very interesting effects of “partial flux quantization” that arise
when a superconductor loop is closed by a Josephson junction, forming the so-called Superconducting
QUantum Interference Device - “SQUID”. Such devices are used, in particular, for supersensitive
magnetometry and ultrafast, low-power computing. 16


3.2. Landau levels and quantum Hall effect 

In the last section, we have used the Schrödinger equation (27) for analysis of static magnetic
field effects in the “almost-1D”, circular geometries shown in Figs. 1, 2, and 4. However, this equation
describes very interesting effects in higher dimensions as well, especially in the 2D case. Let us consider
a uniform 2D quantum well (say, parallel to the [x, y] plane), with strong confinement in the
perpendicular direction z. According to the discussion in Sec. 1.6, energy-relaxed particles will always
reside in the lowest energy subband, with constant quantization energy $(E_z)_1$. Adding this shift to the well’s
flat floor U(x, y) = const, and taking the resulting constant energy as the reference, for the 2D motion of
the particle in the well, we reduce Eq. (27) to a similar equation, but with the Laplace operator acting
only in directions x and y:

$$-\frac{\hbar^2}{2m}\left(\mathbf{n}_x\frac{\partial}{\partial x} + \mathbf{n}_y\frac{\partial}{\partial y} - i\frac{q}{\hbar}\mathbf{A}\right)^2\psi = E\psi. \tag{3.41}$$

Let us find its solutions for the simplest case when the applied static magnetic field is uniform
and perpendicular to the plane:

$$\boldsymbol{\mathscr{B}} = \mathscr{B}\mathbf{n}_z. \tag{3.42}$$

14 For more detail, see EM Sec. 6.3. 

15 It was predicted in 1949 by F. London and experimentally discovered (independently and virtually 
simultaneously) in 1961 by two experimental groups: B. Deaver and W. Fairbank, and R. Doll and M. Nabauer. 

16 A brief review of these effects, and recommendations for further reading may be found in EM Sec. 
According to the second of Eqs. (21), this imposes the following restriction on the choice of
vector-potential:

$$\mathscr{B} = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}, \tag{3.43}$$


but the gauge transformations still give us a lot of freedom in its choice. The “natural” axially-symmetric
form, $\mathbf{A} = \mathbf{n}_\varphi\rho\mathscr{B}/2$, where $\rho = (x^2 + y^2)^{1/2}$ is the distance from some z-axis, leads to
cumbersome math. In 1930, L. Landau realized that the energy spectrum of Eq. (41) may be obtained by
making a very simple choice

$$A_x = 0, \qquad A_y = \mathscr{B}(x - x_0), \tag{3.44}$$

(with arbitrary $x_0$), which evidently satisfies Eq. (43), though it ignores the physical equivalence of the x
and y directions. Now, expanding the eigenfunction into the Fourier integral in direction y:

$$\psi(x,y) = \int X_k(x)\exp\{ik(y - y_0)\}\,dk, \tag{3.45}$$

we see that for each component of this integral, Eq. (41) yields a specific equation

$$-\frac{\hbar^2}{2m}\left[\mathbf{n}_x\frac{d}{dx} + i\mathbf{n}_y\left(k - \frac{q}{\hbar}\mathscr{B}(x - x_0)\right)\right]^2 X_k = EX_k. \tag{3.46}$$


Since the two vectors inside the square brackets are mutually perpendicular, its square has no cross-terms, so
that Eq. (46) may be rewritten as

$$-\frac{\hbar^2}{2m}\frac{d^2X_k}{dx^2} + \frac{\hbar^2}{2m}\left[\frac{q}{\hbar}\mathscr{B}(x - x_0')\right]^2 X_k = EX_k, \quad\text{where } x_0' \equiv x_0 + \frac{\hbar k}{q\mathscr{B}}. \tag{3.47}$$

This 1D Schrödinger equation is identical to Eq. (2.268) for the 1D harmonic oscillator, but with the
center at point $x_0'$, and frequency $\omega_0$ equal to

$$\omega_c = \frac{|q|\mathscr{B}}{m}. \tag{3.48}$$


In this expression, it is easy to recognize the classical cyclotron frequency of the particle’s motion in the
magnetic field. (It may be readily obtained using the 2nd Newton law for a circular orbit of radius r,

$$m\frac{v^2}{r} = F_{\mathscr{B}} = qv\mathscr{B}, \tag{3.49}$$

and noting that the resulting ratio $v/r = q\mathscr{B}/m$ is just the radius-independent angular velocity $\omega_c$ of
the particle’s rotation.) Hence, the energy spectrum for each Fourier component of integral (45) is the same:

$$E_n = \hbar\omega_c\left(n + \frac{1}{2}\right), \quad n = 0, 1, 2, \ldots, \tag{3.50}$$

and does not depend on either $x_0$, or $y_0$, or k.

This is an example of a highly degenerate system: for each eigenvalue $E_n$, there are many different eigenfunctions that differ by the positions $\{x_0, y_0\}$ of their center on axis x, and the rate k of their phase change along axis y. They may be used to assemble a large variety of linear combinations, including 2D wave packets whose centers move along classical circular orbits with some radius r determined by initial conditions. Note, however, that such a radius cannot be smaller than the so-called Landau radius,

$$r_L \equiv \left(\frac{\hbar}{|q|\mathscr{B}}\right)^{1/2}, \tag{3.51}$$

which characterizes the minimum radius of the wave packet itself, and results from Eq. (2.271) after replacement (48). This radius is remarkably independent of the particle's mass, and may be interpreted in the following way: the scale $\mathscr{B}A_{min}$ of the applied magnetic field's flux through the effective area $A_{min} = 2\pi r_L^2$ of the smallest wave packet is just one normal flux quantum $\Phi_0' \equiv 2\pi\hbar/|q|$.
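As a rough numerical illustration of Eqs. (48), (50) and (51), the following Python sketch evaluates the cyclotron frequency, the Landau level energies and the Landau radius. The function names, the choice of a free electron (with its vacuum mass rather than any effective mass), and the field of 1 T are assumptions made only for this example.

```python
import math

# SI values of the constants; the electron mass is the CODATA 2018 value.
HBAR = 1.054571817e-34         # J*s
E_CHARGE = 1.602176634e-19     # C
M_ELECTRON = 9.1093837015e-31  # kg


def cyclotron_frequency(q, b_field, mass):
    """Eq. (48): omega_c = |q| B / m, in rad/s."""
    return abs(q) * b_field / mass


def landau_level(n, q, b_field, mass):
    """Eq. (50): E_n = hbar * omega_c * (n + 1/2), in joules."""
    return HBAR * cyclotron_frequency(q, b_field, mass) * (n + 0.5)


def landau_radius(q, b_field):
    """Eq. (51): r_L = (hbar / (|q| B))**0.5, in meters (no mass dependence)."""
    return math.sqrt(HBAR / (abs(q) * b_field))


if __name__ == "__main__":
    B = 1.0  # tesla, an illustrative value
    omega_c = cyclotron_frequency(E_CHARGE, B, M_ELECTRON)
    r_l = landau_radius(E_CHARGE, B)
    print(f"omega_c       = {omega_c:.4e} rad/s")
    print(f"level spacing = {HBAR * omega_c / E_CHARGE * 1e3:.4f} meV")
    print(f"r_L           = {r_l * 1e9:.2f} nm")
    # The flux through the area 2*pi*r_L**2 should equal 2*pi*hbar/|q|.
    print(B * 2 * math.pi * r_l**2, 2 * math.pi * HBAR / E_CHARGE)
```

The last printed pair compares the flux through the area $2\pi r_L^2$ with the flux quantum $2\pi\hbar/|q|$; the two numbers should coincide for any field.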


A detailed analysis of such wave packets (for which we would not have time in this course) shows that the magnetic field does not change the average density $dN_2/dE$ of different 2D states on the energy scale, but just "assembles" them on the Landau levels (see Fig. 5a), so that the number of states on each Landau level (per unit area) is

$$n_L \equiv \frac{1}{A}\frac{dN_2}{dE}\Delta E = \frac{1}{A}\frac{dN_2}{dE}\hbar\omega_c = \frac{m}{2\pi\hbar^2}\,\hbar\frac{q\mathscr{B}}{m} = \frac{q\mathscr{B}}{2\pi\hbar}. \tag{3.52}$$

This expression may again be interpreted in terms of magnetic flux quanta: $n_L\Phi_0' = \mathscr{B}$, i.e. there is one particular state on each Landau level per each flux quantum.



Fig. 3.5. (a) “Condensation” of 
2D states on Landau levels, and 
(b) filling the levels by external 
electrons at the quantum Hall
effect. 


The most famous application of the Landau levels concept is the explanation of the quantum Hall effect. 17 Generally, the Hall effect 18 is observed in the geometry sketched in Fig. 6, where electric current I is passed through a thin rectangular conducting sample (frequently called the Hall bar) placed into a magnetic field $\mathscr{B}$ perpendicular to the sample plane. The classical analysis of the effect is based on the notion of the Lorentz force (14). This force deviates the charge carriers (say, electrons) from their straight motion from one external electrode to another, bending them to the isolated edges of the bar (in Fig. 6, parallel to axis x). Here the carriers accumulate, generating a gradually increasing electric field $\mathscr{E}$, until its force (16) exactly balances the Lorentz force (14):


17 It was first observed in 1980 by K. von Klitzing and coworkers. 

18 Discovered in 1879 by E. Hall. 
$$q\mathscr{E}_y = qv_x\mathscr{B}, \tag{3.53}$$

where $v_x$ is the drift velocity of the electrons along the bar (Fig. 6), providing the sustained balance condition $v_x = \mathscr{E}_y/\mathscr{B}$ at each point of the 2D sample.



Fig. 3.6. Hall bar geometry. Darker 
rectangles show external (3D) electrodes. 


With $n_2$ carriers per unit area, in a sample of width W, this condition yields the following classical expression for the so-called Hall resistance $R_H$:

$$R_H \equiv \frac{V_y}{I_x} = \frac{\mathscr{E}_y W}{q n_2 v_x W} = \frac{\mathscr{B}}{q n_2}. \tag{3.54}$$

This formula is broadly used in practice for the measurement of the carrier density $n_2$, and (in semiconductors) the carrier type - negative electrons or positive holes.

However, in experiments with high-quality (low-defect) 2D quantum wells at very low, sub-kelvin temperatures 19 and high magnetic fields, the linear growth of $R_H$ with $\mathscr{B}$, described by Eq. (54), is interrupted by virtually horizontal plateaus (Fig. 7) with constant values

$$R_H = \frac{1}{i}R_K, \tag{3.55}$$

where i (only in this context, following tradition!) is an integer, and the value

$$R_K \approx 25.812807557\ \mathrm{k}\Omega \tag{3.56}$$

is reproduced with extremely high accuracy ($\sim 10^{-9}$) from experiment to experiment and from sample to sample. Such stability is a very rare exception in solid state physics, where most results are noticeably dependent on the particular material and particular sample under study.


Let us apply the Landau level picture. The 2D sample is typically in a weak contact with 3D electrodes whose conduction electrons form a Fermi sea with a certain Fermi energy $E_F$, so that at low temperatures all states with $E < E_F$ are filled with electrons - see Fig. 5b. As $\mathscr{B}$ is increased, the spacing $\hbar\omega_c$ between the Landau levels increases, so that fewer and fewer of these levels are below $E_F$ and are filled, and within broad ranges of field variation, the number i of filled levels is constant. (In Fig. 5b, i = 2.) So, plugging $n_2 = i n_L$ and $q = \pm e$ into Eq. (54), we get

$$R_H = \frac{1}{i}\frac{\mathscr{B}}{q n_L} = \frac{1}{i}\frac{2\pi\hbar}{e^2}, \tag{3.57}$$


19 Recently, the quantum Hall effect was observed at room temperature in graphene (a virtually perfect 2D sheet of carbon atoms, see Sec. 4 below) - see K. Novoselov et al., Science 315, 1379 (2007).
i.e. exactly the experimental result (55), with

$$R_K = \frac{2\pi\hbar}{e^2}. \tag{3.58}$$

This constant, exactly 4 times the quantum unit of resistance $R_Q$ given by Eq. (2.259), is in excellent agreement with the experimental value (56), and is sometimes called the Klitzing constant.
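The chain of relations from the classical Hall resistance (54) to the plateau values (55) can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the level-counting rule simply counts the levels of Eq. (50) below the Fermi energy, and the effects of disorder and electron interactions mentioned below are ignored.

```python
import math

HBAR = 1.054571817e-34      # J*s
E_CHARGE = 1.602176634e-19  # C


def classical_hall_resistance(b_field, n2, q):
    """Eq. (54): R_H = B / (q * n2), in ohms; the sign follows that of q."""
    return b_field / (q * n2)


def carrier_density_from_hall(b_field, r_hall, q):
    """Eq. (54) inverted for the 2D carrier density n2 (per square meter)."""
    return b_field / (q * r_hall)


def filled_landau_levels(fermi_energy, level_spacing):
    """Number i of levels E_n = spacing * (n + 1/2), n = 0, 1, ..., below E_F."""
    return max(0, math.ceil(fermi_energy / level_spacing - 0.5))


def klitzing_constant():
    """R_K = 2*pi*hbar / e**2, in ohms."""
    return 2 * math.pi * HBAR / E_CHARGE**2


def plateau_resistance(i):
    """Eq. (55): Hall resistance on the plateau with i filled Landau levels."""
    return klitzing_constant() / i


if __name__ == "__main__":
    print(f"R_K = {klitzing_constant():.3f} ohm")
    for i in (1, 2, 3, 4):
        print(i, f"{plateau_resistance(i):.3f} ohm")
    # Consistency check: n2 = i * n_L in Eq. (54) reproduces the plateau value.
    B, i = 5.0, 3
    n_l = E_CHARGE * B / (2 * math.pi * HBAR)  # Eq. (52) with |q| = e
    print(classical_hall_resistance(B, i * n_l, E_CHARGE), plateau_resistance(i))
```

In practice, `carrier_density_from_hall` is the direction in which Eq. (54) is normally used: the measured sign of $R_H$ gives the carrier type, and its magnitude gives $n_2$.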


Fig. 3.7. Typical record of the quantum Hall effect. The lower trace (with sharp peaks) shows the longitudinal component, $V_x/I_x$, of the resistance tensor. (Adapted from www.prequark.org/Prequark.htm .)


However, this oversimplified explanation of the quantum Hall effect does not take into account 
several important factors, including: 

(i) the role of nonuniformity of the quantum well bottom potential U(x, y), and of the localized 
states this nonuniformity produces, and the surprisingly small effect of these factors on the extraordinary 
accuracy of Eq. (55); 20 and 

(ii) the mutual Coulomb interaction of the electrons, in high-quality samples leading to the 
formation of R H plateaus with not only integer, but also fractional values of i (1/3, 2/5, 3/7, etc.). 21 

Unfortunately, a thorough discussion of these interesting features is well beyond the framework 
of this course. 22 


3.3. Scattering and diffraction 

The second class of quantum effects that become richer in multi-dimensional space is typically referred to as either diffraction or scattering - depending on the context. (Diffraction is essentially the interference, but of waves emitted by many coherent sources.) Just as the two


20 The explanation of this paradox may be obtained in terms of the so-called quantum edge channels - the quasi-1D regions of width (51), along the lines where the Landau levels cross the Fermi surface. Particle motion along these channels, which is responsible for electron transfer, is effectively one-dimensional and thus cannot be affected by modest nonuniformities of the potential distribution U(x, y).

21 This fractional quantum Hall effect was discovered in 1982 by D. Tsui, H. Stormer, and A. Gossard. In 
contrast, the effect described by Eq. (55) with integer i (Fig. 7) is now called the integer quantum Hall effect. 

22 For a comprehensive discussion of these effects I can recommend, e.g., either the monograph by D. Yoshioka, 
The Quantum Hall Effect, Springer, 1998, or the review by D. Yennie, Rev. Mod. Phys. 59, 781 (1987). 
slits in the Young-type experiment (Fig. 1), these sources are most frequently the elementary re-emitters of some initial wave from a single source. More generally, such re-emitting is called scattering; this term is also applied to particles - even if their quantum properties may be ignored. 23


Figure 8 shows the general scattering situation. Most commonly, the detector of scattered particles (in the quantum case, read de Broglie waves) is located at a large distance $r \gg a$ from the scatterer. 24 In this case, the main observable independent of r is the flux (number of particles per unit time) of particles scattered in a certain direction, i.e. their flux per unit solid angle. Since such flux is proportional to the incident flux of particles per unit area, the ability of the scatterer to re-emit in a particular direction may be characterized by the ratio of these two fluxes. This ratio has the dimensionality of area per unit solid angle, and is called the differential cross-section of the scatterer:

$$\frac{d\sigma}{d\Omega} \equiv \frac{\text{flux of scattered particles per unit solid angle}}{\text{flux of incident particles per unit area}}. \tag{3.58}$$

[Fig. 3.8. General scattering situation (figure not reproduced; surviving label: "incident particles").]

Such name and notation stem from the fact that the integral of $d\sigma/d\Omega$ over all scattering angles,

$$\sigma \equiv \oint\frac{d\sigma}{d\Omega}\,d\Omega = \frac{\text{total flux of scattered particles}}{\text{incident flux per unit area}}, \tag{3.59}$$

(also with the dimensionality of area), has a simple interpretation as the full cross-section of scattering. For the simplest case when a macroscopic solid object scatters all classical particles hitting its surface, but does not affect the particles flying by it, $\sigma$ is just the geometrical cross-section of the object, as visible from the direction of incoming particles.


In classical mechanics, 25 we first calculate the particle scattering angle as a function of the impact parameter b, and then average the result over all values of b, considered random. In this sense the calculations in wave mechanics are simpler, because a parallel beam of incident particles of fixed energy E may be fairly presented by a plane de Broglie wave

$$\psi_0 = a_0 e^{i\mathbf{k}_0\cdot\mathbf{r}}, \tag{3.60}$$


23 See, e.g., CM Sec. 3.7. 

24 In optics, this limit is called the Fraunhofer diffraction - see, e.g., EM Secs. 8.6 and 8.8.

25 For example, in the simplest task of derivation of the so-called Rutherford formula for scattering of a charged 
nonrelativistic particle by a point fixed charge, due to their Coulomb interaction - see, e.g., CM Sec. 3.7. 
with the free-space wave number $k_0 = (2mE)^{1/2}/\hbar$ and constant probability current density (1.49):

$$\mathbf{j}_0 = \frac{\hbar}{m}|a_0|^2\mathbf{k}_0. \tag{3.61}$$

This current density is exactly the flux of incident particles per unit area that is used in the denominator of definition (58), so the "only" remaining thing to do is to calculate the numerator of that fraction.

To do this, let us write the Schrödinger equation for the scattering problem (now in the whole space including the scatterer) in the form

$$(E - \hat{H}_0)\psi = U(\mathbf{r})\psi, \tag{3.62}$$

where

$$\hat{H}_0 \equiv -\frac{\hbar^2}{2m}\nabla^2, \quad \text{and} \quad E = \frac{\hbar^2k_0^2}{2m} = \frac{\hbar^2k^2}{2m}, \tag{3.63}$$

and the potential energy $U(\mathbf{r})$ describes the effect of the scatterer. Looking for the solution of Eq. (62) in the natural form

$$\psi = \psi_0 + \psi_s, \tag{3.64}$$

where $\psi_0$ is the incident wave (60), and $\psi_s$ has the sense of the scattered wave, and taking into account that the former wave satisfies the free-space equation

$$\hat{H}_0\psi_0 = E\psi_0, \tag{3.65}$$

we may reduce Eq. (62) to

$$(E - \hat{H}_0)\psi_s = U(\mathbf{r})(\psi_0 + \psi_s). \tag{3.66}$$

The most straightforward (and common) simplification of this problem is possible if the scattering potential $U(\mathbf{r})$ is in some sense weak. (We will derive the exact condition of this smallness below.) Then since at $U(\mathbf{r}) = 0$ the scattered wave $\psi_s$ disappears, we may expect that at small but nonvanishing $U(\mathbf{r})$, the main part of $\psi_s$ is proportional to its scale $U_0$. Then all terms in Eq. (66) are proportional to $U_0$, besides the product $U\psi_s$, which is proportional to $U_0^2$. Hence, in the first approximation in $U_0$, that term may be ignored, and Eq. (66) reduces to the famous equation of the Born approximation: 26

$$(E - \hat{H}_0)\psi_s = U(\mathbf{r})\psi_0. \tag{3.67a}$$


This simplification changes the situation drastically, because the linear superposition principle 
allows finding an explicit solution of this equation (in the form of an integral) for an arbitrary function 
U(r). Indeed, after rewriting Eq. (67a) as 


26 Named after M. Born, who was the first one to apply this approximation in quantum mechanics. However, the basic idea of this approach had been developed much earlier (in 1881) by Lord Rayleigh in the context of electromagnetic wave scattering - see, e.g., EM Sec. 8.3. Note that the contents of that section repeat much of our current discussion - regrettably but unavoidably so, because the Born approximation is a centerpiece of scattering theory for both electromagnetic and de Broglie waves.


Born 

approximation 
$$(\nabla^2 + k^2)\psi_s = \frac{2m}{\hbar^2}U(\mathbf{r})\psi_0(\mathbf{r}), \tag{3.67b}$$

we may notice that $\psi_s$ is just a response of a linear system to a certain "excitation" (represented by the right-hand part) that is fixed, i.e. does not depend on the solution. Hence we can present $\psi_s$ as a sum of responses to elementary excitations from elementary volumes $d^3r'$:

$$\psi_s(\mathbf{r}) = \frac{2m}{\hbar^2}\int U(\mathbf{r}')\psi_0(\mathbf{r}')G(\mathbf{r},\mathbf{r}')\,d^3r'. \tag{3.68}$$

Here $G(\mathbf{r},\mathbf{r}')$ is the spatial Green's function, defined as such an elementary response of the free-space Schrödinger equation to a point excitation, i.e. the solution of the following equation 27

$$(\nabla^2 + k^2)G = \delta(\mathbf{r}-\mathbf{r}'). \tag{3.69}$$


But we already know the physically-relevant spherically-symmetric solution of this equation - see Eq. (7) and its discussion:

$$G(\mathbf{r},\mathbf{r}') = \frac{f_+}{R}e^{ikR}, \quad \text{with } R \equiv |\mathbf{r}-\mathbf{r}'|, \tag{3.70}$$

so that we need just to calculate the coefficient $f_+$ for Eq. (67). This can be done in several ways, for example by noticing that at $R \ll k^{-1}$, the second term on the left-hand side of Eq. (69) is negligible, and it is reduced to the well-known Poisson equation with a delta-functional right-hand part, which describes, for example, the electrostatic potential generated by a point electric charge. Either recalling the Coulomb law, or applying the Gauss theorem, 28 we readily get the asymptote

$$G \to -\frac{1}{4\pi R}, \quad \text{at } kR \ll 1, \tag{3.71}$$

which is compatible with Eq. (70) only if $f_+ = -1/4\pi$, i.e. if

$$G(\mathbf{r},\mathbf{r}') = -\frac{1}{4\pi R}e^{ikR}. \tag{3.72}$$

Plugging this result into Eq. (68), we get the final solution of Eq. (67):

$$\psi_s(\mathbf{r}) = -\frac{m}{2\pi\hbar^2}\int U(\mathbf{r}')\psi_0(\mathbf{r}')\frac{e^{ikR}}{R}\,d^3r'. \tag{3.73}$$
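The free-space Green's function can be checked numerically. The sketch below (with hypothetical function names and an arbitrarily chosen finite-difference step) verifies that $G = -e^{ikR}/4\pi R$ satisfies the homogeneous equation $(\nabla^2 + k^2)G = 0$ at $R > 0$, using the fact that for a spherically-symmetric function $\nabla^2 G = (1/R)\,d^2(RG)/dR^2$, and that $RG$ tends to $-1/4\pi$ at $kR \ll 1$.

```python
import cmath
import math


def green_free_space(big_r, k):
    """G = -exp(i k R) / (4 pi R), the outgoing-wave solution."""
    return -cmath.exp(1j * k * big_r) / (4 * math.pi * big_r)


def helmholtz_residual(big_r, k, h=1e-4):
    """(Laplacian + k**2) G at R > 0, with Laplacian G = (1/R) d2(R G)/dR2.

    The second derivative is approximated by a central finite difference
    with step h, so the result is zero only up to discretization error.
    """
    def u(x):
        return x * green_free_space(x, k)

    d2u = (u(big_r + h) - 2 * u(big_r) + u(big_r - h)) / h**2
    return d2u / big_r + k**2 * green_free_space(big_r, k)


if __name__ == "__main__":
    k = 2.0
    for big_r in (0.5, 1.0, 3.0):
        print(big_r, abs(helmholtz_residual(big_r, k)))
    # Small-R limit: R * G should approach -1/(4 pi) when k R << 1.
    print(1e-6 * green_free_space(1e-6, k), -1 / (4 * math.pi))
```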


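Note that if function $U(\mathbf{r})$ is smooth, the singularity in the denominator is integrable (i.e. not dangerous); indeed, the contribution of a sphere of radius $\mathscr{R} \to 0$, centered at the singular point $\mathbf{r}' = \mathbf{r}$, scales as

$$\int_{R<\mathscr{R}}\frac{d^3R}{R} = 4\pi\int_0^{\mathscr{R}}\frac{R^2\,dR}{R} = 4\pi\int_0^{\mathscr{R}}R\,dR = 2\pi\mathscr{R}^2 \to 0. \tag{3.74}$$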


27 Please notice both the similarity and difference between this Green’s function and the propagator discussed in 
Sec. 2.1. In both cases, we use the linear superposition principle to solve wave equations, but while Eq. (68) gives 
the solution of the inhomogeneous equation (67), Eq. (2.44) does that for a homogeneous Schrodinger equation in 
which the wave sources are presented by initial conditions rather than by equation’s right-hand part. 

28 See, e.g., EM Sec. 1.2. 
Actually, Eq. (73) gives us more than we wanted: it evaluates the scattered wave at any point, including those within the scattering object, while our goal was to find the wave far from the scatterer - please revisit Fig. 8 if you need. However, before going to that limit, we can use the general formula to find the quantitative criterion of the Born approximation's validity. Indeed, let us estimate the magnitude of the right-hand part of this equation, for a scatterer of linear size ~a, and the potential magnitude scale $U_0$, in two limits:

(i) If $ka \ll 1$, then inside the scatterer (i.e., at distances $r' \sim a$), both $\exp\{ikR\}$ and the second exponent under the integral change slowly, so that a crude estimate of the solution is

$$|\psi_s| \sim \frac{m}{2\pi\hbar^2}U_0|\psi_0|a^2. \tag{3.75}$$

(ii) In the opposite limit $ka \gg 1$, the integration along one of the dimensions (that of the wave propagation) is cut out on distances of the order of the de Broglie wavelength $k^{-1}$, so that the integral is correspondingly smaller:

$$|\psi_s| \sim \frac{m}{2\pi\hbar^2}U_0|\psi_0|\frac{a}{k}. \tag{3.76}$$

Since the reduction of Eq. (66) to Eq. (67) requires $|\psi_s| \ll |\psi_0|$ everywhere within the scatterer, we may now formulate the conditions of this requirement as

$$U_0 \ll \frac{\hbar^2}{ma^2}\max[ka, 1]. \tag{3.77}$$

In the first factor of the right-hand part, we may readily recognize the scale of the kinetic (quantum-confinement) energy $E_a$ of the particle inside a quantum well of size ~a, so that the Born approximation is valid essentially if the potential energy of the particle's interaction with the scatterer is smaller than $E_a$. Note, however, that estimates (75) and (76) are not valid in special situations when the effects of scattering accumulate in some direction. This is frequently the case for small scattering angles in extended objects (when $ka \gg 1$ but $ka\theta \lesssim 1$), and especially in 1D (or quasi-1D) scatterers oriented along the incident particle beam.
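Criterion (77) is easy to turn into a quick numerical check. The following sketch is only an illustration: the function names are hypothetical, the default units with $\hbar = m = 1$ are a convenience, and the threshold separating "much smaller than 1" from "not small" is an arbitrary assumption, since Eq. (77) is an order-of-magnitude estimate.

```python
def born_validity_ratio(u0, a, k, mass=1.0, hbar=1.0):
    """Ratio U0 / [(hbar**2 / (m a**2)) * max(k a, 1)] from criterion (77).

    The Born approximation is expected to be reasonable when this ratio
    is much smaller than 1.
    """
    energy_scale = hbar**2 / (mass * a**2)
    return abs(u0) / (energy_scale * max(k * a, 1.0))


def born_is_plausible(u0, a, k, mass=1.0, hbar=1.0, threshold=0.1):
    """True if the ratio is below the (assumed, adjustable) threshold."""
    return born_validity_ratio(u0, a, k, mass, hbar) < threshold


if __name__ == "__main__":
    # Same scatterer probed at low (ka << 1) and high (ka >> 1) energy.
    for k in (0.1, 1.0, 10.0, 100.0):
        print(k, born_validity_ratio(u0=2.0, a=1.0, k=k),
              born_is_plausible(u0=2.0, a=1.0, k=k))
```

The loop illustrates the trend expressed by Eq. (77): a potential too strong for the Born approximation at low energy may become tractable at sufficiently high energy.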

Now let us proceed to large distances $r \gg r' \sim a$, and simplify Eq. (73) using an approximation similar to the dipole expansion in electrodynamics. 29 In the denominator's R, we can merely ignore r' in comparison with r, but the exponent requires more care, because even if $r' \sim a \ll r$, the product $kr' \sim ka$ may still be larger than 1. In the first approximation in r', we can take (Fig. 9a):





29 See, e.g., EM Sec. 8.2. 
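$$R \equiv |\mathbf{r}-\mathbf{r}'| \approx r - \mathbf{n}_r\cdot\mathbf{r}', \tag{3.78}$$

and since the directions of vectors $\mathbf{k}$ and $\mathbf{r}$ coincide, i.e. $\mathbf{k} = k\mathbf{n}_r$,

$$kR \approx kr - \mathbf{k}\cdot\mathbf{r}', \quad \text{and} \quad e^{ikR} \approx e^{ikr}e^{-i\mathbf{k}\cdot\mathbf{r}'}. \tag{3.79}$$

With this replacement, and the incident wave in form (60), the Born approximation yields

$$\psi_s(\mathbf{r}) \approx -\frac{ma_0}{2\pi\hbar^2}\frac{e^{ikr}}{r}\int U(\mathbf{r}')\,e^{-i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}'}\,d^3r'. \tag{3.80}$$

This relation may be presented in a general form 30

$$\psi_s = a_0\frac{f(\mathbf{k},\mathbf{k}_0)}{r}e^{ikr}, \tag{3.81}$$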


where $f(\mathbf{k},\mathbf{k}_0)$ is called the scattering function. 31 Its physical sense becomes clear from the calculation of the corresponding probability current density $\mathbf{j}_s$. For that, generally we need to use Eq. (1.47) with the gradient operator having all spherical-coordinate components. 32 However, at $kr \gg 1$ the main contribution to $\nabla\psi_s$, proportional to $k \gg 1/r$, is provided by the factor $\exp\{ikr\}$, which changes fast in the common direction of vectors $\mathbf{r}$ and $\mathbf{k}$, so that

$$\nabla\psi_s \approx i\mathbf{k}\psi_s, \quad \text{at } kr \gg 1, \tag{3.82}$$

and Eq. (1.47) yields

$$\mathbf{j}_s(\theta) \approx \frac{\hbar}{m}|a_0|^2\frac{|f(\mathbf{k},\mathbf{k}_0)|^2}{r^2}\mathbf{k}. \tag{3.83}$$

Since this vector is parallel to $\mathbf{k}$ and hence to $\mathbf{r}$, the flux in the numerator of Eq. (58), i.e. the probability current per unit solid angle, is just $r^2j_s$. Hence, the differential cross-section is simply

$$\frac{d\sigma}{d\Omega} = \frac{j_sr^2}{j_0} = |f(\mathbf{k},\mathbf{k}_0)|^2, \tag{3.84}$$

and the total cross-section is

$$\sigma = \oint|f(\mathbf{k},\mathbf{k}_0)|^2\,d\Omega, \tag{3.85}$$

so that the scattering function $f(\mathbf{k},\mathbf{k}_0)$ gives us everything we need (and in fact more, because the function also contains information about the phase of the scattered wave).


30 It is easy to prove that this form is an asymptotic form of any solution $\psi_s$ of the scattering problem (even that beyond the Born approximation) at sufficiently large distances $r \gg a, k^{-1}$.

31 Note that function f has the dimension of length, and does not account for the incident wave. This is why sometimes a dimensionless function, $S = 1 + 2ikf$, is used instead. This function S is called the scattering matrix, because it may be considered as a natural generalization of the 1D matrix S, defined by Eq. (2.133), to higher dimensionality.

32 See, e.g., MA Eq. (10.8).
According to Eq. (80), in the Born approximation the scattering function may be presented as the Born integral

$$f(\mathbf{k},\mathbf{k}_0) = -\frac{m}{2\pi\hbar^2}\int U(\mathbf{r})\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,d^3r, \tag{3.86}$$

where for notation simplicity I have replaced $\mathbf{r}'$ with $\mathbf{r}$, and also introduced the scattering vector

$$\mathbf{q} \equiv \mathbf{k} - \mathbf{k}_0, \tag{3.87}$$

with length $q = 2k\sin(\theta/2)$, where $\theta$ is the scattering angle between vectors $\mathbf{k}$ and $\mathbf{k}_0$ - see Fig. 9b. For the differential cross-section, Eq. (86) yields

$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2\left|\int U(\mathbf{r})\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,d^3r\right|^2, \tag{3.88}$$

and the total cross-section may be now readily calculated from the first of Eqs. (59). 33


This is the main result of this section; it may be further simplified for spherically-symmetric scatterers, with

$$U(\mathbf{r}) = U(r). \tag{3.89}$$

Here, it is convenient to present the exponent in the Born integral as $\exp\{-iqr\cos\chi\}$, where $\chi$ is the angle between the vectors $\mathbf{r}$ and $\mathbf{q}$ (rather than the incident wave vector $\mathbf{k}_0$!) - see Fig. 9b. Now, for fixed $\mathbf{q}$, we can take this vector's direction as the polar axis of a spherical coordinate system, and reduce Eq. (86) to a 1D integral:

$$\begin{aligned}
f(\mathbf{k},\mathbf{k}_0) &= -\frac{m}{2\pi\hbar^2}\int_0^\infty r^2dr\,U(r)\int_0^{2\pi}d\varphi\int_0^\pi\sin\chi\,d\chi\,\exp\{-iqr\cos\chi\} \\
&= -\frac{m}{2\pi\hbar^2}\int_0^\infty r^2dr\,U(r)\,2\pi\frac{2\sin qr}{qr} = -\frac{2m}{\hbar^2q}\int_0^\infty U(r)\sin(qr)\,r\,dr.
\end{aligned} \tag{3.90}$$


As a simple example, let us use the Born approximation to analyze scattering on the following spherically-symmetric potential:

$$U(\mathbf{r}) = U_0\exp\left\{-\frac{r^2}{2a^2}\right\}. \tag{3.91}$$

In this particular case, it is better to avoid the temptation to exploit the spherical symmetry by using Eq. (90), and instead use the generic Eq. (88), because its integral falls apart into a product of three similar Cartesian factors:

$$f(\mathbf{k},\mathbf{k}_0) = -\frac{mU_0}{2\pi\hbar^2}I_xI_yI_z, \tag{3.92}$$

with


33 Note that according to Eq. (88), in the Born approximation the scattering intensity does not depend on the sign 
of potential U, and also that scattering in a certain direction is completely determined by a specific Fourier 
harmonic of function U(r), namely by the harmonic with the wave vector equal to the scattering vector q. 


Differential 
cross- 
section 
in the Born 
approximation 
$$I_x \equiv \int_{-\infty}^{+\infty}\exp\left\{-\left(\frac{x^2}{2a^2} + iq_xx\right)\right\}dx, \tag{3.93}$$

and similar integrals for $I_y$ and $I_z$. From Chapter 2, we already know that Gaussian integrals like $I_x$ may be readily worked out by complementing the exponent to the full square, in our current case giving

$$I_x = (2\pi)^{1/2}a\exp\left\{-\frac{q_x^2a^2}{2}\right\}, \text{ etc.}, \tag{3.94}$$

so that, finally,

$$\frac{d\sigma}{d\Omega} = \left(\frac{mU_0}{2\pi\hbar^2}\right)^2|I_xI_yI_z|^2 = 2\pi a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2e^{-q^2a^2}. \tag{3.95}$$


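Now, the total cross-section $\sigma$ is an integral of $d\sigma/d\Omega$ over all directions of vector $\mathbf{k}$. Since in our case the scattering intensity does not depend on the azimuthal angle $\varphi$, the integration is reduced to that over the scattering angle $\theta$ (Fig. 9b):

$$\begin{aligned}
\sigma &= \oint\frac{d\sigma}{d\Omega}\,d\Omega = 2\pi\int_0^\pi\frac{d\sigma}{d\Omega}\sin\theta\,d\theta = 4\pi^2a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2\int_0^\pi\sin\theta\,d\theta\,\exp\left\{-\left(2k\sin\frac{\theta}{2}\right)^2a^2\right\} \\
&= 4\pi^2a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2\int_{\theta=0}^{\theta=\pi}\exp\{-2k^2a^2(1-\cos\theta)\}\,d(1-\cos\theta) = \frac{2\pi^2}{k^2}\left(\frac{mU_0a^2}{\hbar^2}\right)^2\left[1 - e^{-4k^2a^2}\right].
\end{aligned} \tag{3.96}$$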


Let us analyze these formulas. In the low-energy limit, $ka \ll 1$ (and hence $qa \ll 1$ for any scattering angle), the scattered wave is virtually isotropic: $d\sigma/d\Omega \approx \text{const}$ - a very typical feature of scattering by small objects, in any approximation. Notice that in this limit, the Born expression for $\sigma$,

$$\sigma \approx 8\pi^2a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2, \tag{3.97}$$

is only valid if $\sigma$ is much smaller than the scale $a^2$ of the physical cross-section of the scatterer.

In the opposite, high-energy limit $ka \gg 1$, the scattering is dominated by small angles $\theta \sim q/k \sim 1/ka \sim \lambda/a$:

$$\frac{d\sigma}{d\Omega} \approx 2\pi a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2\exp\{-k^2a^2\theta^2\}. \tag{3.98}$$

This is, again, very typical for diffraction. Notice, however, that due to the smooth character of the Gaussian potential (91), the diffraction pattern exhibits no oscillations; such oscillations of $d\sigma/d\Omega$ as a function of angle naturally appear for potentials with sharp borders - see, e.g., Problems 2 and 3.
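The Gaussian example lends itself to a numerical cross-check. The sketch below, in units with $\hbar = m = 1$ by default, compares the radial Born integral (90), evaluated by the trapezoidal rule, with the closed form that follows from Eqs. (92)-(94), and compares a direct angular integration of $|f|^2$ with Eq. (96). All function names, the integration cutoff `r_max`, and the step counts are assumptions of this illustration.

```python
import math


def gaussian_potential(r, u0, a):
    """Eq. (91): U(r) = U0 * exp(-r**2 / (2 a**2))."""
    return u0 * math.exp(-r**2 / (2 * a**2))


def born_amplitude_radial(q, potential, r_max, steps=20000, mass=1.0, hbar=1.0):
    """Eq. (90): f = -(2m / (hbar**2 q)) * integral of U(r) sin(q r) r dr.

    The integral over 0 <= r <= r_max is taken by the trapezoidal rule;
    r_max must be large enough for U(r) to be negligible beyond it.
    """
    h = r_max / steps
    total = 0.0
    for n in range(steps + 1):
        r = n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * potential(r) * math.sin(q * r) * r
    return -2 * mass / (hbar**2 * q) * total * h


def born_amplitude_gaussian(q, u0, a, mass=1.0, hbar=1.0):
    """Closed form from Eqs. (92)-(94): f = -(m U0 a**3 (2 pi)**0.5 / hbar**2) exp(-q**2 a**2 / 2)."""
    prefactor = -mass * u0 * a**3 * math.sqrt(2 * math.pi) / hbar**2
    return prefactor * math.exp(-(q * a)**2 / 2)


def total_cross_section_gaussian(k, u0, a, mass=1.0, hbar=1.0):
    """Eq. (96)."""
    strength = mass * u0 * a**2 / hbar**2
    return 2 * math.pi**2 / k**2 * strength**2 * (1 - math.exp(-4 * (k * a)**2))


def total_cross_section_numeric(k, u0, a, steps=4000, mass=1.0, hbar=1.0):
    """2 pi * integral of |f|**2 sin(theta) d(theta), with q = 2 k sin(theta / 2)."""
    h = math.pi / steps
    total = 0.0
    for n in range(steps + 1):
        theta = n * h
        q = 2 * k * math.sin(theta / 2)
        weight = 0.5 if n in (0, steps) else 1.0
        f = born_amplitude_gaussian(q, u0, a, mass, hbar)
        total += weight * f**2 * math.sin(theta)
    return 2 * math.pi * total * h


if __name__ == "__main__":
    u0, a, k = 0.1, 1.0, 1.5
    for q in (0.3, 1.3, 2.5):
        numeric = born_amplitude_radial(q, lambda r: gaussian_potential(r, u0, a), 12 * a)
        print(q, numeric, born_amplitude_gaussian(q, u0, a))
    print(total_cross_section_numeric(k, u0, a), total_cross_section_gaussian(k, u0, a))
```

Note that the amplitude returned by both routines is purely real, in line with the remark on the optical theorem that follows.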

The Born approximation, while being very simple and used more often than any other scattering theory, is not without substantial shortcomings, as is clear from the following example. It is not too difficult to prove the following general optical theorem, valid for an arbitrary scatterer:

$$\mathrm{Im}\,f(\mathbf{k}_0,\mathbf{k}_0) = \frac{k}{4\pi}\sigma. \tag{3.99}$$

However, Eq. (86) shows that in the Born approximation, function f is purely real at $\mathbf{q} = 0$ (i.e. $\mathbf{k} = \mathbf{k}_0$), and hence cannot satisfy the optical theorem. Even more evidently, it cannot describe such a simple effect as a dark shadow ($\psi \approx 0$) cast by an opaque object (say, with $U_0 \gg E$).

There are several ways to improve the Born approximation, while still holding the general idea of approximate treatment of U.


(i) Instead of the main assumption $\psi_s \propto U_0$, we can use a complete perturbation series:

$$\psi_s = \psi_1 + \psi_2 + \ldots \tag{3.100}$$

with $\psi_n \propto U_0^n$, and find successive approximations $\psi_n$ one by one. In the 1st approximation we of course return to the Born formula, but already the 2nd approximation yields

$$\mathrm{Im}\,f_2(\mathbf{k}_0,\mathbf{k}_0) = \frac{k}{4\pi}\sigma_1, \tag{3.101}$$

where $\sigma_1$ is the full cross-section calculated in the 1st approximation, so that the optical theorem (99) is "almost" satisfied. 34

(ii) As was mentioned above, the Born approximation does not work very well for small-angle scattering by extended objects. This deficiency may be corrected by the so-called eikonal approximation (from the Greek word εικών, meaning "icon") that replaces the plane-wave exponent $\exp\{ik_0x\}$ representing the incident wave by a WKB-like exponent, though still in the first nonvanishing approximation in $U_0$:

$$e^{ik_0x} \to \exp\left\{i\int^x k(x')\,dx'\right\} \equiv \exp\left\{i\int^x\frac{(2m[E - U(x')])^{1/2}}{\hbar}\,dx'\right\} \approx \exp\left\{i\left[k_0x - \frac{m}{\hbar^2k_0}\int^x U(x')\,dx'\right]\right\}. \tag{3.102}$$

This approximation's results satisfy the optical theorem (99) already in the 1st approximation in U.
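The quality of the first-order expansion of the WKB-like phase may be estimated numerically. In the sketch below (units with $\hbar = m = 1$ by default), the "exact" phase integral of $k(x) = (2m[E - U(x)])^{1/2}/\hbar$ is compared with its linearization in U. The function names, the explicit lower integration limit `x_start`, and the sample Gaussian profile are assumptions of this illustration, which also presumes $E > U(x)$ everywhere.

```python
import math


def trapezoid(func, x_min, x_max, steps=20000):
    """Composite trapezoidal rule for a smooth function on [x_min, x_max]."""
    h = (x_max - x_min) / steps
    total = 0.5 * (func(x_min) + func(x_max))
    for n in range(1, steps):
        total += func(x_min + n * h)
    return total * h


def wkb_phase(x, x_start, energy, potential, mass=1.0, hbar=1.0):
    """Integral of k(x') = (2m[E - U(x')])**0.5 / hbar from x_start to x."""
    def k_local(xp):
        return math.sqrt(2 * mass * (energy - potential(xp))) / hbar

    return trapezoid(k_local, x_start, x)


def eikonal_phase(x, x_start, energy, potential, mass=1.0, hbar=1.0):
    """First order in U: k0 (x - x_start) - (m / (hbar**2 k0)) * integral of U."""
    k0 = math.sqrt(2 * mass * energy) / hbar
    correction = mass / (hbar**2 * k0) * trapezoid(potential, x_start, x)
    return k0 * (x - x_start) - correction


if __name__ == "__main__":
    def profile(x):
        return 0.1 * math.exp(-x**2 / 2)  # sample potential with U << E

    energy = 12.5  # so that k0 = 5 in these units
    exact = wkb_phase(10.0, -10.0, energy, profile)
    approx = eikonal_phase(10.0, -10.0, energy, profile)
    print(exact, approx, exact - approx)
```

For a potential much smaller than E, the difference between the two phases is of the second order in $U_0/E$, i.e. much smaller than the first-order phase shift itself.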


3.4. Energy bands in higher dimensions

In Sec. 2.5, we have discussed the 1D band theory for potential profiles U(x) that obey the periodicity condition (2.192). For what follows, let us notice that that condition may be rewritten as

$$U(x + X) = U(x), \tag{3.103}$$


34 The construction of such series may be facilitated by the following observation. If we retain $\psi_s$ in the right-hand part of Eq. (66), we may write a relation formally similar to Eq. (68) for the full wavefunction $\psi = \psi_0 + \psi_s$:

$$\psi(\mathbf{r}) = \psi_0(\mathbf{r}) + \frac{2m}{\hbar^2}\int U(\mathbf{r}')\psi(\mathbf{r}')G(\mathbf{r},\mathbf{r}')\,d^3r'.$$

This is one of the forms of the Lippmann-Schwinger equation, which is exactly equivalent to the differential Schrödinger equation (66) but is more convenient for some applications, in particular for the calculation of higher approximations $\psi_n$. Unfortunately, I will not have time to discuss this approach in detail and have to refer the reader, for example, to either Chapter 9 of the textbook by L. Schiff, Quantum Mechanics, 3rd ed., McGraw-Hill, 1968, or (for even more details) to the monograph by J. Taylor, Scattering Theory, Dover, 2006.
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Bravais 
lattice 
and its 
potential 


where $X = \tau a$, with $\tau$ being an arbitrary integer. One can say that the set of points X forms a periodic 1D lattice in the direct (x-) space. We have also seen that each Bloch state (i.e., each eigenstate of the Schrödinger equation for such a periodic potential) is characterized by the quasi-momentum $\hbar q$, and its energy does not change if q is changed by a multiple of $2\pi/a$. Hence if we form, in the reciprocal (k-) space, a 1D lattice of points $Q = lb$, with $b \equiv 2\pi/a$ and integer l, any pair of points from these two mutually reciprocal lattices satisfies the following rule:

$$\exp\{iQX\} = \exp\left\{il\frac{2\pi}{a}\tau a\right\} = e^{2\pi il\tau} = 1. \tag{3.104}$$


In this form, the results of Sec. 2.5 may be readily extended to d-dimensional periodic potentials whose translational symmetry obeys the following generalization of Eq. (103):

$$U(\mathbf{r} + \mathbf{R}) = U(\mathbf{r}), \tag{3.105}$$

where points $\mathbf{R}$, which may be numbered by d integers $\tau_j$, form the so-called Bravais lattice 35 of points

$$\mathbf{R} = \sum_{j=1}^{d}\tau_j\mathbf{a}_j, \tag{3.106}$$

with d primitive vectors $\mathbf{a}_j$. The simplest example of a 3D Bravais lattice is given by the simple cubic lattice (Fig. 10a), which may be described by the system of mutually perpendicular primitive vectors $\mathbf{a}_j$ of equal length. However, not in every lattice are these vectors perpendicular; for example, Figs. 10b and 10c show possible sets of the primitive vectors describing the face-centered cubic lattice (fcc) and body-centered cubic lattice (bcc). In 3D, the science of crystallography, based on group theory, distinguishes, by their symmetry properties, 14 Bravais lattices grouped into 7 different lattice systems. 36





Fig. 3.10. The simplest (and most common) 3D Bravais lattices: (a) simple cubic, (b) face-centered cubic (fcc), and (c) body-centered cubic (bcc), and possible choices of their primitive vector sets (blue arrows).


Note, however, that not all highly symmetric sets of points form Bravais lattices. As probably the most striking example, nodes of the very simple 2D honeycomb lattice (Fig. 11a) cannot be described by


35 Named after A. Bravais, the crystallographer who introduced this notion in 1850. 

36 The strongest motivation for the band theory is provided by properties of solid crystals. Thus it is not surprising 
that perhaps the most clear, well illustrated introduction to the Bravais lattices may be found in Chapters 4 and 7 
of the famous textbook by N. Ashcroft and N. Mermin, Solid State Physics , Saunders College, 1976. 
a Bravais lattice - while the 2D hexagonal lattice, shown in Fig. 11b, can. The most prominent 3D case of such a lattice is the diamond structure (Fig. 11c), which describes, in particular, atoms of the world's most important crystal - silicon. 37 In cases like these, the band theory is much facilitated by the fact that these point systems may be described by Bravais lattices of some point assemblies (called primitive unit cells). For example, Fig. 11a shows a possible choice of primitive vectors for the honeycomb structure, 38 with the primitive unit cell formed by any two adjacent points of the original lattice (say, within the dashed ellipses in Fig. 11a). Similarly, the diamond lattice may be described as the fcc Bravais lattice with a two-point primitive unit cell. 39

Now we are ready for the following generalization of the 1D Bloch theorem, given by Eqs. (2.193) and (2.210), to higher dimensions. Any eigenfunction of the Schrödinger equation describing the particle's motion in the periodic potential (105) may be presented either as

$$\psi(\mathbf{r} + \mathbf{R}) = \psi(\mathbf{r})e^{i\mathbf{q}\cdot\mathbf{R}}, \tag{3.107}$$

or as

$$\psi(\mathbf{r}) = u(\mathbf{r})e^{i\mathbf{q}\cdot\mathbf{r}}, \quad \text{with } u(\mathbf{r} + \mathbf{R}) = u(\mathbf{r}), \tag{3.108}$$

where the quasi-momentum $\hbar\mathbf{q}$ is again a constant of motion, but now is a vector.

Fig. 3.11. Some important periodic structures that require two-point primitive cells for their Bravais lattice presentation: (a) 2D honeycomb lattice and its primitive vectors and (c) 3D diamond lattice. For contrast, panel (b) shows the 2D hexagonal structure, which forms a Bravais lattice with a single-point primitive cell.


The key notion of the band theory is the reciprocal lattice in the wavevector space, formed as

$$\mathbf{Q} = \sum_{j=1}^{d}l_j\mathbf{b}_j, \tag{3.109}$$


37 It may be best understood as the sum of two fcc lattices of side a, mutually shifted by vector {1, 1, 1}a/4, so that the distances between each point of the combined lattice and its 4 nearest neighbors (see the thick gray lines in Fig. 11c) are all equal.

38 This structure is presently very popular due to the recent discovery of graphene - isolated monolayer sheets of 
carbon atoms arranged in a honeycomb lattice with the interatomic distance of 0.142 nm. 

39 A harder case is presented by quasicrystals (whose idea may be traced down to medieval Islamic tilings, but 
was discovered in natural crystals, by D. Shechtman et al., only in 1984), which obey high (say, 5-fold) rotational 
symmetry, but cannot be described by a Bravais lattice with any finite primitive unit cell. For a popular review of 
quasicrystals see, for example, P. Stephens and A. Goldman, Sci. Amer. 264, #4, 24 (1991). 
with integer $l_j$, and vectors $\mathbf{b}_j$ selected in such a way that the following generalization of Eq. (104) is valid for any pair of points of the direct and reciprocal lattices:

$$e^{i\mathbf{Q}\cdot\mathbf{R}} = 1. \tag{3.110}$$


The importance of lattice Q is immediately clear from the first formulation of the Bloch theorem, given by Eq. (107): if we add to q any vector Q of the reciprocal lattice, the factor exp{iq·R} in that relation, and hence the wavefunction's behavior at lattice translations, does not change. This means that all information about the system is contained in just one elementary cell of the reciprocal space q. Its most frequent choice, called the 1st Brillouin zone, is the set of all points q that are closer to the origin than to any other point of lattice Q.


It is easy to see that primitive vectors $\mathbf{b}_j$ of the reciprocal 3D lattice 40 may be constructed from those of the initial, direct lattice as

Primitive vectors of the reciprocal lattice

$$\mathbf{b}_1 = 2\pi\frac{\mathbf{a}_2\times\mathbf{a}_3}{\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}, \quad \mathbf{b}_2 = 2\pi\frac{\mathbf{a}_3\times\mathbf{a}_1}{\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}, \quad \mathbf{b}_3 = 2\pi\frac{\mathbf{a}_1\times\mathbf{a}_2}{\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}. \tag{3.111}$$

Indeed, from the “operand rotation rule” of the vector algebra 41 it is evident that $\mathbf{a}_j\cdot\mathbf{b}_{j'} = 2\pi\delta_{jj'}$. Hence, the exponent on the left-hand side of Eq. (110) is reduced to

$$e^{i\mathbf{Q}\cdot\mathbf{R}} = \exp\{2\pi i(l_1\tau_1 + l_2\tau_2 + l_3\tau_3)\}. \tag{3.112}$$

Since all $l_j$ and $\tau_j$ are integers, the expression in the parentheses is also an integer, so the exponent indeed equals 1, thus satisfying the definition of the reciprocal lattice given by Eq. (110).

As the simplest example, let us return to the simple cubic lattice of period a (Fig. 10a), oriented in space so that

$$\mathbf{a}_1 = a\mathbf{n}_x, \quad \mathbf{a}_2 = a\mathbf{n}_y, \quad \mathbf{a}_3 = a\mathbf{n}_z. \tag{3.113}$$

According to Eq. (111), its reciprocal lattice is (of course) also cubic:

$$\mathbf{Q} = \frac{2\pi}{a}\left(l_x\mathbf{n}_x + l_y\mathbf{n}_y + l_z\mathbf{n}_z\right), \tag{3.114}$$

so that the 1st Brillouin zone is a cube with side $b = 2\pi/a$. Almost equally simple calculations show that the reciprocal lattice of fcc is bcc, and vice versa. Figure 12 shows the resulting 1st Brillouin zone of the fcc lattice.
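The construction (111) is easy to check numerically. The following is a minimal sketch in plain Python; the helper names (`cross`, `dot`, `reciprocal_vectors`) and the particular choice of fcc primitive vectors are assumptions made for illustration, not part of the text above.

```python
import math

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(u, v))

def reciprocal_vectors(a1, a2, a3):
    """Primitive vectors b1, b2, b3 of the reciprocal lattice, per Eq. (3.111)."""
    scale = 2.0 * math.pi / dot(a1, cross(a2, a3))   # 2*pi / [a1 . (a2 x a3)]
    b1 = tuple(scale * c for c in cross(a2, a3))
    b2 = tuple(scale * c for c in cross(a3, a1))
    b3 = tuple(scale * c for c in cross(a1, a2))
    return b1, b2, b3

a = 1.0                                               # lattice period (arbitrary unit)
sc = ((a, 0.0, 0.0), (0.0, a, 0.0), (0.0, 0.0, a))    # simple cubic, Eq. (3.113)
fcc = ((0.0, a / 2, a / 2), (a / 2, 0.0, a / 2), (a / 2, a / 2, 0.0))  # assumed fcc choice

b_sc = reciprocal_vectors(*sc)
b_fcc = reciprocal_vectors(*fcc)
```

With this choice, `b_fcc[0]` comes out as (2π/a){-1, 1, 1}, i.e. one of the primitive vectors of a bcc lattice with the cube side 4π/a, in agreement with the statement that the reciprocal lattice of fcc is bcc.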


The notion of the reciprocal lattice 42 makes the multi-dimensional band theory not much more complex than that in 1D, especially for numerical calculations, at least for the single-point Bravais lattices. Indeed, repeating all the steps that have led to Eq. (2.218), but now with a d-dimensional Fourier expansion of functions $U(\mathbf{r})$ and $u(\mathbf{r})$, we readily get its generalization:

$$\sum_{\mathbf{l}'\neq\mathbf{l}} U_{\mathbf{l}'-\mathbf{l}}\,u_{\mathbf{l}'} = (E - E_{\mathbf{l}})\,u_{\mathbf{l}}, \tag{3.115}$$

40 For the 2D case (j = 1, 2), one may use, for example, the first two formulas of Eq. (111) with $\mathbf{a}_3 = \mathbf{a}_1\times\mathbf{a}_2$.

41 See, e.g., MA Eq. (7.6). 

42 This notion is also the main starting point of X-ray diffraction studies of crystals, because it allows rewriting the well-known Bragg condition for diffraction peaks in the extremely simple form of the momentum conservation law: $\mathbf{k} = \mathbf{k}_0 + \mathbf{Q}$, where $\mathbf{k}_0$ and $\mathbf{k}$ are the wave vectors of, respectively, the incident and diffracted photon.

where $\mathbf{l}$ is now a d-dimensional vector of integer indices $l_j$. The summation in Eq. (115) should be carried over all (essential) components of this vector (i.e. over all relevant nodes of the reciprocal lattice), so writing a corresponding computer code requires a bit more care than in 1D; however, this is just a homogeneous system of linear equations, and numerous routines for finding its eigenvalues E are readily available from both public sources and commercial software packages. 43
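As an illustration of such a calculation, here is a hedged sketch for a 2D square lattice, using NumPy's symmetric eigenvalue routine. Everything specific in it is an assumption made for the example: the units ħ = m = a = 1, the model potential U(r) = 2U0[cos(2πx) + cos(2πy)] (so that the only nonzero Fourier components are U0 on the four nearest nodes of the reciprocal lattice), the cutoff `lmax`, and the function name `band_energies`.

```python
import itertools
import numpy as np

def band_energies(q, U0=0.1, lmax=3):
    """Eigenvalues E of the linear system (3.115) for a 2D square lattice.

    Assumed units: hbar = m = a = 1. Assumed model potential:
    U(r) = 2*U0*[cos(2*pi*x) + cos(2*pi*y)], i.e. U_l = U0 for
    l = (+-1, 0) and (0, +-1), and zero otherwise.
    """
    nodes = list(itertools.product(range(-lmax, lmax + 1), repeat=2))
    H = np.zeros((len(nodes), len(nodes)))
    for i, l in enumerate(nodes):
        k = np.asarray(q, dtype=float) + 2.0 * np.pi * np.array(l)
        H[i, i] = 0.5 * (k @ k)                 # free-particle energy E_l
        for j, lp in enumerate(nodes):
            d = (lp[0] - l[0], lp[1] - l[1])
            if abs(d[0]) + abs(d[1]) == 1:      # nearest reciprocal-lattice nodes
                H[i, j] = U0                    # Fourier component U_{l'-l}
    return np.linalg.eigvalsh(H)                # eigenvalues in ascending order

# Energy gap at the middle of a Brillouin zone edge, q = (pi/a, 0)
E_edge = band_energies((np.pi, 0.0))
gap = E_edge[1] - E_edge[0]
```

For a weak potential, `gap` should be close to 2·U0, the first-order splitting of the two degenerate free-particle states at the zone boundary, just as in the 1D case.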


Fig. 3.12. 1st Brillouin zone of the fcc lattice, and the traditional notation of its main directions. Adapted from http://en.wikipedia.org/wiki/Band_structure.


What is indeed more complex than in 1D is the presentation (and hence the comprehension :-) of the calculation results and experimental data. Typically, the presentation is limited to plotting the Bloch state eigenenergy as a function of components of vector q along certain special directions in the reciprocal space of quasi-momentum (see, e.g., the lines shown in Fig. 12), with all these plots typically placed on a single panel. Figure 13 shows perhaps the most famous (and certainly the most practically important) of such plots, the band structure of silicon. The dashed horizontal lines mark the “indirect” gap of width 1.12 eV between the “valence” and “conduction” energy bands, which is the playground of virtually all silicon-based electronics.


[Fig. 3.13. Band structure of silicon; vertical axis: E [eV].]



43 See, e.g., MA Sec. 16 (iv).

In order to understand the reason for this complexity of the band structure presentation, let us see how we would start to develop the weak-potential approximation for the simplest case of a 2D square lattice (which is a subset of the cubic lattice, with $\tau_3 = 0$). Its 1st Brillouin zone is of course also a square, of area $(2\pi/a)^2$. Let us draw the lines of constant energy of a free particle (U = 0) in this zone. Repeating the arguments of Sec. 2.7 (see especially Fig. 2.28 and its discussion), we should conclude that Eq. (2.216) should now be generalized as follows,

$$E = E_{\mathbf{l}} \equiv \frac{\hbar^2k^2}{2m} = \frac{\hbar^2}{2m}\left[\left(q_x - \frac{2\pi l_x}{a}\right)^2 + \left(q_y - \frac{2\pi l_y}{a}\right)^2\right], \tag{3.116}$$


with all possible integers $l_x$ and $l_y$. Considering the result only within the 1st Brillouin zone, we see that as energy E grows, the lines of equal energy evolve as shown in Fig. 14. Just like in 1D, the weak-potential effects are only important at the Brillouin zone boundaries, and may be crudely considered as the appearance of narrow energy gaps, but one can see that the band structure in q-space is complex enough even without these effects.



Fig. 3.14. Lines of constant energy E of a free particle, within the 1st Brillouin zone of a square Bravais lattice, for: (a) $E/E_1 \approx 0.95$, (b) $E/E_1 \approx 1.05$; and (c) $E/E_1 \approx 2.05$, where $E_1 \equiv \pi^2\hbar^2/2ma^2$.


The tight-binding approximation is usually easier to follow. For example, for the same square 2D lattice, we may repeat the arguments that have led us to Eq. (2.203), to write 44

$$i\hbar\frac{da_{0,0}}{dt} = -\delta_n\left(a_{-1,0} + a_{+1,0} + a_{0,+1} + a_{0,-1}\right), \tag{3.117}$$

where indices correspond to the deviations of integers $\tau_x$ and $\tau_y$ from an arbitrarily selected minimum of the potential energy - and hence from the wavefunction’s “hump” quasi-localized at this minimum. Now, looking for the stationary solution of these equations that corresponds to the Bloch theorem (107), instead of Eq. (2.206) we get

$$E = E_n + \varepsilon_n = E_n - \delta_n\left(e^{iq_xa} + e^{-iq_xa} + e^{iq_ya} + e^{-iq_ya}\right) = E_n - 2\delta_n\left(\cos q_xa + \cos q_ya\right). \tag{3.118}$$

Figure 15 shows this result, within the 1st Brillouin zone, in two forms: as the color-coded lines of equal energy and as a 3D plot (also enhanced by color).
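To see how differently the band (118) looks along different directions of the q-plane, one may tabulate it along an axis and along a diagonal of the 1st Brillouin zone. This is a minimal sketch; the function name `tb_energy` and the units δ_n = a = 1 are assumptions made for illustration.

```python
import math

def tb_energy(qx, qy, delta=1.0, a=1.0):
    """Band energy eps_n = E - E_n of Eq. (3.118) for a square 2D lattice."""
    return -2.0 * delta * (math.cos(qx * a) + math.cos(qy * a))

edge = math.pi   # zone boundary pi/a, for a = 1
points = (0.0, edge / 2, edge)

along_axis = [tb_energy(q, 0.0) for q in points]      # q_y = 0
along_diagonal = [tb_energy(q, q) for q in points]    # q_x = q_y
```

Along the axis the energy rises from -4δ_n only to 0, while along the diagonal it spans the full bandwidth, from -4δ_n to +4δ_n.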


44 Actually, using the same values of $\delta_n$ in both directions implies some sort of symmetry of the quasi-localized states. For example, s-states of axially-symmetric potentials (see the next section) always have such a symmetry.

Fig. 3.15. Allowed band energy $\varepsilon_n = E - E_n$ for a square 2D lattice, in the tight-binding approximation.


It is evident that the plots of this function along different lines on the q-plane, for example along one of the axes (say, $q_x$) and along a diagonal of the 1st Brillouin zone (say, $q_x = q_y$), give different curves, qualitatively similar to those of silicon (Fig. 13). The latter structure is further complicated by the fact that the primitive cell of its Bravais lattice contains 2 atoms - see Fig. 11c and its discussion. In this case, even the tight-binding picture becomes more complex. Indeed, even if the atoms in the different positions of the primitive unit cell are similar (as they are, for example, in both graphene and silicon), and hence the potential well shape near those points and the corresponding local wavefunctions w(r) are similar as well, the Bloch theorem (which only pertains to Bravais lattices!) does not forbid them to have different complex amplitudes a(t) whose time evolution should be described by a specific differential equation.

For example, in order to describe the honeycomb lattice shown in Fig. 11a, we have to prescribe different amplitudes to the “top” and “bottom” points of its primitive cell - say, α and β, correspondingly. Since each of these points is surrounded by (and hence weakly interacts with) 3 neighbors of the opposite type, instead of Eq. (117) we have to write two equations

$$i\hbar\frac{d\alpha}{dt} = -\delta_n\sum_{j=1}^{3}\beta_j, \qquad i\hbar\frac{d\beta}{dt} = -\delta_n\sum_{j'=1}^{3}\alpha_{j'}, \tag{3.119}$$


where each summation is over 3 nearest-neighbor points. (I am using different summation indices just to emphasize that these directions are different for the “top” and “bottom” points of the primitive cell - see Fig. 11a.) Now using the Bloch theorem (107) in the form similar to Eq. (2.205), we get two coupled linear algebraic equations:

$$(E - E_n)\alpha = -\delta_n\beta\sum_{j=1}^{3}e^{i\mathbf{q}\cdot\mathbf{r}_j}, \qquad (E - E_n)\beta = -\delta_n\alpha\sum_{j'=1}^{3}e^{i\mathbf{q}\cdot\mathbf{r}'_{j'}}, \tag{3.120}$$

where $\mathbf{r}_j$ and $\mathbf{r}'_{j'}$ are the nearest-neighbor positions, as seen from the top and bottom points, respectively. Writing the condition of consistency of this system, we get two equal and opposite values of the energy correction for each value of q:

$$E_\pm = E_n \pm \delta_n\Sigma^{1/2}, \quad \text{where } \Sigma \equiv \sum_{j,j'=1}^{3}e^{i\mathbf{q}\cdot(\mathbf{r}_j + \mathbf{r}'_{j'})}. \tag{3.121}$$

According to Eq. (120), these two energy bands correspond to the phase shifts (on top of the regular Bloch shift $\mathbf{q}\cdot\Delta\mathbf{r}$) of either 0 or π between the adjacent quasi-localized wavefunctions w(r).

The most interesting corollary of such energy symmetry, augmented by the honeycomb lattice symmetry, is that for certain values $\mathbf{q}_D$ of vector q (that turn out to be in each of the 6 corners of the honeycomb-shaped 1st Brillouin zone), the double sum Σ vanishes, i.e. the two band surfaces $E_\pm(\mathbf{q})$ touch each other. As a result, in the vicinities of these Dirac points 45 the dispersion relation is linear:

$$E_\pm\Big|_{\mathbf{q}\approx\mathbf{q}_D} \approx E_n \pm \hbar v_n|\tilde{\mathbf{q}}|, \quad \text{where } \tilde{\mathbf{q}} \equiv \mathbf{q} - \mathbf{q}_D, \tag{3.122}$$

with $v_n \propto \delta_n$ being a constant with the dimension of velocity (for graphene, close to $10^6$ m/s). Such a linear dispersion relation ensures several interesting transport properties of graphene. For their discussion, I have to refer the reader to special literature. 46
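The vanishing of the double sum Σ at a zone corner, and the linear dispersion near it, can be checked with a few lines of code. The sketch below assumes one particular orientation of the honeycomb lattice: the three neighbors of a “top” point are placed at the positions listed in `NEIGHBORS` (with the interatomic distance taken for the unit of length), and those of a “bottom” point at the opposite positions, so that Σ = |f(q)|², where f is the single sum over j. The names `f`, `band_pair`, and `K` are illustrative.

```python
import cmath
import math

D = 1.0  # interatomic distance (assumed unit of length)
# Assumed neighbor positions r_j of a "top" point; for a "bottom" point, r'_j = -r_j.
NEIGHBORS = [(D, 0.0),
             (-D / 2, D * math.sqrt(3) / 2),
             (-D / 2, -D * math.sqrt(3) / 2)]

def f(qx, qy):
    """Single sum over the three neighbors, sum_j exp{i q.r_j}."""
    return sum(cmath.exp(1j * (qx * x + qy * y)) for x, y in NEIGHBORS)

def band_pair(qx, qy, delta=1.0):
    """Energy corrections E - E_n of the two bands, per Eq. (3.121), with Sigma^(1/2) = |f|."""
    s = abs(f(qx, qy))
    return -delta * s, +delta * s

# One corner of the 1st Brillouin zone for this lattice orientation
K = (0.0, 4 * math.pi / (3 * math.sqrt(3) * D))

eps = 1e-4
slope = abs(f(K[0] + eps, K[1])) / eps   # |f| / |q - q_D| near the corner
```

In this sketch, |f| vanishes at `K`, and grows near it as (3/2)·D·|q − q_D|, so that within this model ħv_n = 3δ_nD/2.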


3.5. Axially-symmetric systems 

I cannot conclude this chapter (and hence our review of wave mechanics) without addressing the 
issue of eigenstates and eigenvalues at full confinement in multi-dimensional potentials U(r). For an 
arbitrary potential, the stationary Schrodinger equation does not have an analytical solution, but a 
substantial symmetry of function U(r) may make such solution possible. This pertains, in particular, to 
the axial symmetry in 2D problems and the spherical symmetry in 3D problems, which are typical for 
several important situations (or their reasonable models), especially in atomic and nuclear physics. 

In rare cases such symmetry may be exploited by the separation of variables in Cartesian coordinates. The most famous example is the d-dimensional harmonic oscillator, i.e. a particle moving inside the potential

$$U = \frac{m\omega_0^2}{2}\sum_{j=1}^{d}r_j^2. \tag{3.123}$$

Separating the variables exactly as we did for the rectangular quantum well (see Sec. 1.5), for each degree of freedom we get the Schrödinger equation (2.268) of a 1D oscillator, whose eigenfunctions are given by Eq. (2.278), and the energy spectrum is described by Eq. (2.114). As a result, the total energy spectrum may be indexed by a vector $\mathbf{n} = \{n_1, n_2, \ldots, n_d\}$ of d independent integers (quantum numbers):

$$E_{\mathbf{n}} = \hbar\omega_0\left(\sum_{j=1}^{d}n_j + \frac{d}{2}\right), \tag{3.124}$$


45 This term is based on a (pretty loose) analogy with the Dirac theory of relativistic quantum mechanics, to be discussed in Chapter 9 below. Namely, in the vicinity of a Dirac point (122), Schrödinger equations (119), and hence the dispersion relation (122), may be obtained from the effective Hamiltonian $\hat{H}_n = \hbar v_n\hat{\boldsymbol{\sigma}}\cdot\tilde{\mathbf{q}}$. (Since vector $\tilde{\mathbf{q}}$ is two-dimensional, this Hamiltonian employs only two of the three Pauli matrices.) This expression resembles the first term of Dirac’s Hamiltonian (9.97), which is defined, however, in a different Hilbert space.

46 See, e.g., a recent review by A. Castro Neto et al., Rev. Mod. Phys. 81, 109 (2009). Note that transport properties of graphene are determined by coupling of $2p_z$ electron states of carbon atoms, whose wavefunctions are proportional to exp{±iφ} rather than being axially-symmetric as implied by Eqs. (120). However, due to the lattice symmetry this fact does not affect the dispersion relation E(q).

all of them ranging from 0 to ∞. Note that every energy level of this system, with the only exception of the ground state,

$$\psi_g = \prod_{j=1}^{d}\psi_0(r_j) = \frac{1}{\pi^{d/4}x_0^{d/2}}\exp\left\{-\frac{1}{2x_0^2}\sum_{j=1}^{d}r_j^2\right\}, \tag{3.125}$$

is degenerate: several different wavefunctions, each with its own different set of quantum numbers $n_j$, but the same value of their sum, have the same energy.
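The degree of this degeneracy is easy to count by brute force. The sketch below enumerates all sets of d non-negative integers with a given sum N and compares the count with the standard combinatorial (“stars and bars”) result C(N + d − 1, d − 1); that closed form is not derived in the text above and is quoted here as a known fact. The function names are illustrative.

```python
from itertools import product
from math import comb

def degeneracy_by_counting(N, d):
    """Number of sets {n_1, ..., n_d} of non-negative integers with the sum N."""
    return sum(1 for n in product(range(N + 1), repeat=d) if sum(n) == N)

def degeneracy_closed_form(N, d):
    """Stars-and-bars count of the same sets: C(N + d - 1, d - 1)."""
    return comb(N + d - 1, d - 1)

# Degeneracies of the five lowest levels of the 3D oscillator
levels_3d = [degeneracy_by_counting(N, 3) for N in range(5)]
```

For d = 3 this gives 1, 3, 6, 10, 15 for N = 0, 1, 2, 3, 4, so that only the ground level (N = 0) is non-degenerate; for d = 1 all levels are non-degenerate, as they should be.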

However, the harmonic oscillator problem is an exception: for other central- and spherically-symmetric problems the solution is made easier by using more appropriate coordinates. Let us start with the simplest axially-symmetric problem: the so-called planar rigid rotator (or “rotor”), i.e. a particle constrained (confined) to move along a plane, round circle of radius R (Fig. 16). 47



The planar rotator has just one degree of freedom, say the arc displacement $l \equiv R\varphi$. So, its classical energy (and the Hamiltonian function) is $H = p^2/2m$, with $p \equiv mv = m(dl/dt)$. This function is similar to that of a free 1D particle (with the replacement x → l), and hence the rotator’s quantum properties may be described by a similar Hamiltonian operator:

$$\hat{H} = \frac{\hat{p}^2}{2m}, \quad \text{with } \hat{p} = -i\hbar\frac{\partial}{\partial l}, \tag{3.126}$$

and its eigenfunctions have a similar structure:

$$\psi = Ce^{ikl}. \tag{3.127}$$

The “only” new feature is that in the rotator, all observables should be 2πR-periodic functions of l, and hence, as we have already discussed in the context of the magnetic flux quantization (see Fig. 4 and its discussion), as the particle makes one turn about the center, its wavefunction’s phase kl may only change by 2πn, with an arbitrary integer n (from −∞ to +∞):

$$\psi_n(l + 2\pi R) = \psi_n(l)\,e^{2\pi in}. \tag{3.128}$$

With eigenfunctions (127), this condition immediately gives $2\pi kR = 2\pi n$. Thus, wavenumber k can take only quantized values $k_n = n/R$, so that the eigenfunctions should be indexed by n:

Planar rotator: eigenfunctions

$$\psi_n = C_n\exp\left\{in\frac{l}{R}\right\}, \tag{3.129}$$


47 This is a reasonable model for the confinement of light atoms, notably hydrogen, in some organic compounds.

and the energy spectrum is discrete:

Planar rotator: eigenenergies

$$E_n = \frac{\hbar^2k_n^2}{2m} = \frac{\hbar^2n^2}{2mR^2}. \tag{3.130}$$

So, while the free translational motion of a quantum particle is continuous, in the sense that its momentum has a continuous spectrum, its rotation is quantized - the most important fact, which has so many implications (including the existence of atoms, molecules, and hence us humans, and hence science including this course).

This simple model allows an exact analysis of external magnetic field effects on a quantum-confined motion of an electrically charged particle. Indeed, if this field is uniform and directed perpendicular to the rotator’s plane, it does not violate the axial symmetry of the system. According to Eq. (26), in this case we have to generalize Eq. (126) as

$$\hat{H} = \frac{1}{2m}\left(-i\hbar\frac{\partial}{\partial l} - qA\right)^2. \tag{3.131}$$


Here, in contrast to the gauge choice (44), which was so instrumental in the Landau level problem, it is now clearly beneficial to take the vector-potential in a manifestly axially-symmetric form $\mathbf{A} = A(\rho)\mathbf{n}_\varphi$, where $\boldsymbol{\rho} \equiv \{x, y\}$ is the 2D radius-vector. Using the well-known expression for curl in cylindrical coordinates, 48 we can readily check that the requirement $\nabla\times\mathbf{A} = \mathcal{B}\mathbf{n}_z$, with $\mathcal{B}$ = const, is satisfied by the following function:

$$A = \frac{\mathcal{B}\rho}{2}. \tag{3.132}$$


For the planar rotator, ρ = R = const, so that the stationary Schrödinger equation becomes

$$\frac{1}{2m}\left(-i\hbar\frac{\partial}{\partial l} - q\frac{\mathcal{B}R}{2}\right)^2\psi_n = E_n\psi_n. \tag{3.133}$$

A little bit surprisingly, this equation is still satisfied by the sine-wave eigenfunctions (127). Moreover, since the periodicity condition (128) is also unaffected by the applied magnetic field, we return to the field-independent eigenfunctions (129). However, the field does affect the system’s energy:

Planar rotator in magnetic field

$$E_n = \frac{1}{2m}\left(\frac{\hbar n}{R} - q\frac{\mathcal{B}R}{2}\right)^2 = \frac{\hbar^2}{2mR^2}\left(n - \frac{\Phi}{\Phi_0'}\right)^2, \tag{3.134}$$

where $\Phi \equiv \pi R^2\mathcal{B}$ is the magnetic flux through the area limited by the particle’s trajectory, and $\Phi_0' \equiv 2\pi\hbar/q$ is the “normal” magnetic flux quantum we have already met in the AB effect context - see Eq. (34) and its discussion. The field also changes the electric current of the particle in the n-th state:


$$I_n = q\frac{\hbar}{2im}\left[\psi_n^*\left(\frac{\partial}{\partial l} - i\frac{q\mathcal{B}R}{2\hbar}\right)\psi_n - \text{c.c.}\right] = q\frac{\hbar}{mR}|C_n|^2\left(n - \frac{\Phi}{\Phi_0'}\right). \tag{3.135}$$

Normalizing wavefunction (129) to have $W_n = 1$, we get $|C_n|^2 = 1/2\pi R$, so that Eq. (135) becomes

$$I_n = I_0\left(n - \frac{\Phi}{\Phi_0'}\right), \quad \text{with } I_0 \equiv \frac{\hbar q}{2\pi mR^2}. \tag{3.136}$$

Functions $E_n(\Phi)$ and $I_n(\Phi)$ are shown in Fig. 17. Note that since $\Phi_0' \propto 1/q$, for any sign of the particle’s charge, $dI_n/d\Phi < 0$. It is easy to check that this means that the current is diamagnetic, 49 i.e. corresponds to the Lenz rule of Faraday’s electromagnetic induction: the field-induced current flows in such a direction that its own magnetic field tries to compensate the external magnetic flux applied to the loop.

48 See, e.g., MA Eq. (10.5).



Fig. 3.17. Effect of magnetic field on a 
charged planar rotator. Dashed lines show 
possible inelastic transitions between 
metastable and ground states, due to weak 
interaction with environment, as the 
magnetic field is being increased. 


This result may be interpreted as a different implementation of the AB effect. 50 In contrast to the 
two-slit interference experiment that was discussed in Sec. 1, in the situation shown in Fig. 17 the 
particle is not absorbed by the detector, but travels around the ring continuously. As a result, its 
wavefunction is rigid: due to the boundary condition (128), the topological quantum number n is 
discrete, and magnetic field cannot change the wavefunction gradually. In this sense, the system is 
similar to a superconducting loop - see Fig. 4 and its discussion. The difference between these systems 
is two-fold: 

(i) For a single charged particle, in macroscopic systems with practicable values of q, R, and m, the current scale $I_0$ is very small. For example, for $m = m_e$, $q = -e$, and R = 1 μm, Eq. (136) yields $I_0 \approx 3$ pA. 51 The contribution $LI \sim \mu_0RI_0 \sim 10^{-24}$ Wb of so small a current into the net magnetic flux is


49 This effect, whose qualitative features remain the same for all 2D or 3D localized states (see Chapter 6 below), 
is frequently referred to as the orbital diamagnetism. In magnetic materials consisting of particles with 
uncompensated spins, this effect competes with another effect, spin paramagnetism - see, e.g., EM Sec. 5.5. 

50 It is straightforward to check that Eqs. (133) and hence (135) remain valid even if the magnetic field lines do 
not touch the particle’s trajectory, and the field is localized well inside rotator’s ring. 

51 Such persistent, macroscopic diamagnetic currents in non-superconducting systems may be experimentally observed, for example, by measuring the weak magnetic field generated by electrons in a system of a large number (~$10^7$) of similar conducting rings - see, e.g., L. Levy et al., Phys. Rev. Lett. 64, 2074 (1990). Due to the

negligible in comparison with $\Phi_0' \sim 10^{-15}$ Wb, so that the quantization of n does not lead to the magnetic flux quantization.

(ii) As soon as the magnetic field raises the eigenstate energy $E_n$ above that of another eigenstate $E_{n'}$, the former state becomes metastable, and weak interactions of the system with its environment (which are neglected in our simple model) may induce a quantum transition of the system to the lower-energy state, thus reducing the diamagnetic current’s magnitude - see the dashed lines in Fig. 17. The flux quantization in superconductors is much more robust to such perturbations. 52
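The numbers quoted in item (i), and the switching of the ground state described in item (ii), may be reproduced with the following sketch. It assumes the energy spectrum (134) and the current in the form I_n = I_0(n − Φ/Φ0'), with I_0 = ħq/2πmR², which is what Eq. (135) gives after the normalization |C_n|² = 1/2πR. The rounded SI values of the constants, the function names, and the use of the dimensionless argument `flux_ratio` = Φ/Φ0' are assumptions made for illustration.

```python
import math

HBAR = 1.054571817e-34       # J*s
E_CHARGE = 1.602176634e-19   # C
M_E = 9.1093837015e-31       # kg
MU_0 = 4e-7 * math.pi        # H/m (rounded classical value)

def energy(n, flux_ratio, m=M_E, R=1e-6):
    """E_n of Eq. (3.134), in joules; flux_ratio = Phi / Phi_0'."""
    return HBAR**2 / (2 * m * R**2) * (n - flux_ratio)**2

def current(n, flux_ratio, q=-E_CHARGE, m=M_E, R=1e-6):
    """I_n = I_0 * (n - Phi/Phi_0'), with I_0 = hbar*q / (2*pi*m*R^2), in amperes."""
    i0 = HBAR * q / (2 * math.pi * m * R**2)
    return i0 * (n - flux_ratio)

def ground_state_n(flux_ratio):
    """Integer n minimizing E_n at a given flux: the integer nearest to Phi/Phi_0'."""
    return round(flux_ratio)

I0 = abs(current(1, 0.0))            # current scale for an electron, R = 1 micron
self_flux = MU_0 * 1e-6 * I0         # estimate L*I ~ mu_0 * R * I_0, in webers
flux_quantum = 2 * math.pi * HBAR / E_CHARGE
```

For R = 1 μm this gives `I0` close to 3 pA, and `self_flux` of the order of 10^-24 Wb, indeed negligible in comparison with `flux_quantum`. As the flux ratio passes through a half-integer value, `ground_state_n` changes by one, which corresponds to the transitions shown with dashed lines in Fig. 17.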

Now let us return, for one more time, to Eq. (129), and see what it gives for one more observable, the particle’s angular momentum

$$\mathbf{L} \equiv \mathbf{r}\times\mathbf{p}. \tag{3.137}$$

In our current problem, vector L has just one component, perpendicular to the rotator plane:

$$L_z = Rp. \tag{3.138}$$

In classical mechanics, $L_z$ of the rotator should be conserved (due to the absence of external torque), but can take arbitrary values. In quantum mechanics the situation changes: with $p = \hbar k$, our result $k_n = n/R$ may be rewritten as

$$L_z = (L_z)_n = R\hbar k_n = \hbar n. \tag{3.139}$$

Thus, the angular momentum is quantized: it may only be a multiple of the Planck constant ħ - confirming Bohr’s guess - see Eq. (1.10). As we will see in Chapter 5, this result is very general (though it may be modified by spin effects), and wavefunctions (129) may be interpreted as eigenfunctions of the angular momentum operator.

In order to implement the planar rotator in our 3D world, we need to provide rigid confinement of the particle both to the motion plane and along radius ρ. Let us proceed to the more general problem when only the former confinement is strict, i.e. to a 2D particle moving in an arbitrary centrally-symmetric potential

$$U(\boldsymbol{\rho}) = U(\rho). \tag{3.140}$$


Using the well-known expression for the 2D Laplace operator in polar coordinates, 53 we may present the 2D stationary Schrödinger equation in the form

$$-\frac{\hbar^2}{2m}\left[\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\varphi^2}\right]\psi + U(\rho)\psi = E\psi. \tag{3.141}$$

Separating the radial and angular variables as 54


dephasing effects of electron scattering by phonons and other electrons, the effect’s observation requires 
submicron samples and millikelvin temperatures. 

52 Interrupting a superconducting ring with a weak link (Josephson junction), i.e. forming a SQUID, we may get 
the switching behavior similar to that shown with dashed arrows in Fig. 17 - see, e.g., EM Sec. 6.3. 

53 See, e.g., MA Eq. (10.3) with d/dz = 0. 

54 At this stage, I do not want to mark the particular solution (eigenfunction) ψ and corresponding eigenenergy E by any index, because we already may suspect that in a 2D problem the role of this index will be played by two integers - two quantum numbers.

$$\psi = \mathcal{R}(\rho)F(\varphi), \tag{3.142}$$

we get, after the division by ψ and multiplication by $\rho^2$, the following equation:

$$-\frac{\hbar^2}{2m}\left[\frac{\rho}{\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) + \frac{1}{F}\frac{d^2F}{d\varphi^2}\right] + \rho^2U(\rho) = \rho^2E. \tag{3.143}$$


It is clear that the fraction $(d^2F/d\varphi^2)/F$ should be a constant (because all other terms of the equation may be only functions of ρ alone), so that we get for function F(φ) an ordinary differential equation,

$$\frac{d^2F}{d\varphi^2} + \nu^2F = 0, \tag{3.144}$$


where ν is the variable separation constant. The fundamental solution of Eq. (144) is evidently $F \propto \exp\{\pm i\nu\varphi\}$. Now requiring, as we did for the planar rotator, the 2π periodicity of any observable, i.e.

$$F(\varphi + 2\pi) = F(\varphi)\,e^{2\pi in}, \tag{3.145}$$

we see that constant ν has to be integer (say, n), and we can write: 55

$$F = F_n \equiv C_ne^{in\varphi}. \tag{3.146}$$

Plugging the resulting relation $(d^2F/d\varphi^2)/F = -n^2$ into Eq. (143), we may rewrite it as

$$-\frac{\hbar^2}{2m}\left[\frac{1}{\rho\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) - \frac{n^2}{\rho^2}\right] + U(\rho) = E. \tag{3.147}$$

The physical interpretation of this equation is that the full energy is a sum,

$$E = E_\rho + E_\varphi, \tag{3.148}$$

of the radial-motion part

$$E_\rho = -\frac{\hbar^2}{2m}\frac{1}{\rho\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) + U(\rho), \tag{3.149}$$

and the angular-motion part

$$E_\varphi = \frac{\hbar^2n^2}{2m\rho^2}. \tag{3.150}$$


Now let us notice that a similar separation exists in classical mechanics, 56 because the total energy of a particle moving in a central field may be presented, within the plane of motion, as

$$E = \frac{m}{2}v^2 + U(\rho) = \frac{m}{2}\left(\dot\rho^2 + \rho^2\dot\varphi^2\right) + U(\rho) = E_\rho + E_\varphi, \tag{3.151}$$

where


55 Noting that for the planar rotator (Fig. 16) l/R = φ, we can present Eq. (129) in a similar form. This is natural, because the rotator is just a particular case of our current problem - with a rigid confinement along axis ρ.

56 See, e.g., CM Sec. 3.5.

$$E_\rho \equiv \frac{p_\rho^2}{2m} + U(\rho), \qquad E_\varphi \equiv \frac{L_z^2}{2m\rho^2}. \tag{3.152}$$

The comparison of the latter relation with Eqs. (139) and (150) gives us grounds to suspect that the quantization rule $L_z = n\hbar$ may be valid for this problem as well, and perhaps for other systems too. In Sec. 5.6, we will see that this is indeed the case.

Returning to Eq. (147), on the basis of our experience with 1D wave mechanics we may expect that this ordinary, linear, second-order differential equation should have (for a motion confined to a certain finite region of its argument ρ), for any fixed n, a discrete energy spectrum described by some other integer quantum number (say, l). This means that eigenfunctions (142) and the corresponding eigenenergies (148) should be indexed by two quantum numbers. Note, however, that since the radial function obeys equation (147), which already depends on n, function $\mathcal{R}(\rho)$ should carry both indices, so the variable separation is not so “clean” as it was for the rectangular quantum well. Normalizing the angular function to the full circle, Δφ = 2π, we may rewrite Eq. (142) as

$$\psi_{n,l} = \mathcal{R}_{n,l}(\rho)F_n(\varphi) = (2\pi)^{-1/2}\,\mathcal{R}_{n,l}(\rho)\,e^{in\varphi}. \tag{3.153}$$


A good (and important) example of a solvable problem of this type is a free 2D particle whose motion is rigidly confined to a disk of radius R:

$$U(\rho) = \begin{cases} 0, & \text{for } 0 \le \rho < R, \\ +\infty, & \text{for } R < \rho. \end{cases} \tag{3.154}$$

In this case, the solutions $\mathcal{R}_{n,l}(\rho)$ of Eq. (147) are proportional to the Bessel functions of the first kind, $J_n(k_l\rho)$, 57 and the spectrum of possible values of parameter $k_l$ should be found from the boundary condition $\mathcal{R}_{n,l}(R) = 0$. Let me leave the detailed solution and analysis of this problem for the reader’s exercise.
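For a numerical start on this exercise, the boundary condition may be solved by finding the roots of the Bessel functions. The sketch below uses only the standard library: it evaluates J_n(x) from Bessel's integral representation by the trapezoidal rule, and locates the roots by a sign-change scan followed by bisection. These numerical choices, the function names, and the assumed relation E = ħ²k²/2m between the energy and the parameter k inside the disk (where U = 0) are illustrative assumptions, not the method of the text.

```python
import math

def bessel_j(n, x, steps=200):
    """J_n(x) for integer n, from (1/pi) * integral_0^pi cos(n*t - x*sin t) dt."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))   # half-weighted end points t = 0, pi
    for k in range(1, steps):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def bessel_zeros(n, count, x_max=40.0, dx=0.05):
    """First `count` positive roots of J_n: sign-change scan plus bisection."""
    zeros = []
    x, f_prev = dx, bessel_j(n, dx)
    while len(zeros) < count and x < x_max:
        x_next = x + dx
        f_next = bessel_j(n, x_next)
        if f_prev * f_next < 0.0:
            lo, hi, f_lo = x, x_next, f_prev
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = bessel_j(n, mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            zeros.append(0.5 * (lo + hi))
        x, f_prev = x_next, f_next
    return zeros

def disk_levels(n, count):
    """Energies in units of hbar^2/(2*m*R^2), assuming E = hbar^2 k^2 / 2m with k*R a root of J_n."""
    return [z * z for z in bessel_zeros(n, count)]
```

For example, `bessel_zeros(0, 2)` returns values close to 2.4048 and 5.5201, and `bessel_zeros(1, 1)` a value close to 3.8317, so that under the stated assumption the lowest level (n = 0) has the energy of about 5.78 ħ²/2mR².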


3.6. Spherically-symmetric systems: Brute force approach 

Now let us address the (mathematically more involved) case of 3D motion, with a spherically-symmetric potential

$$U(\mathbf{r}) = U(r). \tag{3.155}$$

Let me start, again, with a rigid rotator - now a spherical rotator, i.e. a particle confined to move on the surface of a sphere of radius R. It has 2 degrees of freedom, because any position on the spherical surface is completely described by two coordinates - say, the polar angle θ and the azimuthal angle φ. In this case, the kinetic energy we need to consider is limited to its angular part, so that in the Laplace operator in spherical coordinates 58 we may keep only the corresponding parts, with fixed r = R. Then the stationary Schrödinger equation becomes


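57 A short summary of properties of these functions, plus a few plots and a useful table of values, may be found in EM Sec. 2.4. For more on Bessel functions, see the literature recommended in MA Sec. 16(ii).

58 See, e.g., MA Eq. (10.9).

$$-\frac{\hbar^2}{2mR^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right] = E\psi. \tag{3.156}$$

(Again, I abstain from attaching any indices to ψ and E for the time being.) With the usual variable separation assumption,

$$\psi = \Theta(\theta)F(\varphi), \tag{3.157}$$

Eq. (156), with all terms multiplied by $\sin^2\theta/\Theta F$, yields

$$-\frac{\hbar^2}{2mR^2}\left[\frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{F}\frac{d^2F}{d\varphi^2}\right] = E\sin^2\theta. \tag{3.158}$$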


Just as in Eq. (143), fraction $(d^2F/d\varphi^2)/F$ may be a function of φ only, and hence has to be constant, giving for it an equation similar to Eq. (144). So, the azimuthal functions are just the sine waves (146) again, and we can use the same periodicity condition (145) to write them in the normalized form 59

$$F_m(\varphi) = \frac{1}{(2\pi)^{1/2}}\,e^{im\varphi}. \tag{3.159}$$


With that, fraction $(d^2F/d\varphi^2)/F$ equals $(-m^2)$, and Eq. (158), after multiplication by $\Theta/\sin^2\theta$, is reduced to the following ordinary, linear differential equation for function Θ(θ):

$$-\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{m^2}{\sin^2\theta}\Theta = \varepsilon\Theta, \quad \text{with } \varepsilon \equiv E\Big/\frac{\hbar^2}{2mR^2}. \tag{3.160}$$

It is convenient to recast it into an equation for a new function $P(\xi) \equiv \Theta(\theta)$, with $\xi \equiv \cos\theta$:

$$\frac{d}{d\xi}\left[(1 - \xi^2)\frac{dP}{d\xi}\right] + \left[l(l+1) - \frac{m^2}{1 - \xi^2}\right]P = 0, \tag{3.161}$$

where a new notation for the normalized energy is introduced: $l(l+1) \equiv \varepsilon$. The motivation for such notation is that, according to a mathematical analysis, 60 Eq. (161) with integer m has solutions only if parameter l is integer: l = 0, 1, 2,..., and only if that integer is not smaller than |m|, i.e. if

$$-l \le m \le +l. \tag{3.162}$$


This immediately gives the following energy spectrum of the spherical rotator:

Energy spectrum of spherical rotator

$$E_l = \frac{\hbar^2}{2mR^2}\,l(l+1), \tag{3.163}$$

59 Here, rather regrettably, I had to replace the notation of the integer from n to m, in order to comply with the generally accepted convention for this so-called magnetic quantum number. Let me hope that the difference between this integer and the particle’s mass is absolutely clear from the context.

60 It was carried out by A.-M. Legendre (1752-1833). Just as a historic note: besides many original mathematical results, Dr. Legendre has authored the famous textbook Éléments de Géométrie which dominated teaching geometry through the 19th century.

so that the only effect of the magnetic quantum number m here is imposing the restriction (162) on the orbital quantum number l. This means, in particular, that each energy level (163) corresponds to (2l + 1) different values of m, i.e. is (2l + 1)-degenerate.


To understand the physics of this degeneracy, we need to explore the corresponding eigenfunctions of Eq. (161). They are naturally numbered by two integers, m and l, and are called the associated Legendre functions $P_l^m$. For the particular, simplest case m = 0, these functions are just (Legendre) polynomials $P_l(\xi) \equiv P_l^0(\xi)$, which may be either defined as the solutions of the Legendre equation following from Eq. (161) at m = 0:

$$\frac{d}{d\xi}\left[(1 - \xi^2)\frac{dP}{d\xi}\right] + l(l+1)P = 0, \tag{3.164}$$

or calculated explicitly from the following Rodrigues formula: 61

$$P_l(\xi) = \frac{1}{2^ll!}\frac{d^l}{d\xi^l}\left(\xi^2 - 1\right)^l, \quad l = 0, 1, 2, \ldots \tag{3.165}$$


Using this formula, it is easy to spell out a few lowest Legendre polynomials:

$$P_0(\xi) = 1, \quad P_1(\xi) = \xi, \quad P_2(\xi) = \frac{1}{2}\left(3\xi^2 - 1\right), \quad P_3(\xi) = \frac{1}{2}\left(5\xi^3 - 3\xi\right), \ldots, \tag{3.166}$$

though such expressions become more and more bulky as l is increased. As Fig. 18 shows, as argument ξ is decreased, all these functions start at one point, $P_l(+1) = +1$, and end up either at the same level or at the opposite one: $P_l(-1) = (-1)^l$. On the way between these two end points, the l-th polynomial crosses the horizontal axis exactly l times, i.e. has l roots. 62 It may be shown that on the segment [−1, +1], the Legendre polynomials form a full orthogonal set of functions, with the following normalization rule:

$$\int_{-1}^{+1}P_l(\xi)P_{l'}(\xi)\,d\xi = \frac{2}{2l+1}\delta_{ll'}. \tag{3.167}$$


Fig. 3.18. A few lowest Legendre polynomials $P_l(\xi)$, plotted as functions of ξ = cos θ.
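The listed properties of the Legendre polynomials (the end-point values, the number of roots, and the normalization rule) are easy to verify numerically. In the sketch below the polynomials are computed not from the Rodrigues formula but from the standard Bonnet recurrence (k + 1)P_{k+1} = (2k + 1)ξP_k − kP_{k−1}, which is swapped in here because it is more convenient for computation; the integral is estimated by the simple midpoint rule. The function names are illustrative.

```python
def legendre(l, x):
    """P_l(x) from Bonnet's recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x                      # P_0 and P_1
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def overlap(l1, l2, steps=2000):
    """Midpoint-rule estimate of the integral of P_l1 * P_l2 over [-1, +1]."""
    h = 2.0 / steps
    xs = [-1.0 + (i + 0.5) * h for i in range(steps)]
    return h * sum(legendre(l1, x) * legendre(l2, x) for x in xs)

def count_roots(l, steps=2000):
    """Number of sign changes of P_l on a fine grid inside (-1, +1)."""
    h = 2.0 / steps
    values = [legendre(l, -1.0 + (i + 0.5) * h) for i in range(steps)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0.0)
```

For example, `overlap(2, 2)` is close to 2/5, `overlap(1, 3)` is close to zero, and `count_roots(l)` returns l for the few lowest polynomials.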


61 Derived independently by B. O. Rodrigues in 1816, J. Ivory in 1824, and C. Jacobi in 1827.

62 In this behavior, we readily recognize the standing wave pattern typical for all 1D eigenproblems - cf. Fig. 1.7. The quantitative deviation from the sinusoidal waveform is due to the different metric of the sphere.

For m > 0, the associated Legendre functions may be expressed via the Legendre polynomials (165) using the following formula, which resembles Eq. (165):

$$P_l^m(\xi) = (-1)^m\left(1 - \xi^2\right)^{m/2}\frac{d^m}{d\xi^m}P_l(\xi), \tag{3.168}$$

while if the index m is negative, the following simple relation may be used:

$$P_l^{-m}(\xi) = (-1)^m\frac{(l - m)!}{(l + m)!}P_l^m(\xi). \tag{3.169}$$

On the segment ξ ∈ [−1, +1], each set of the associated Legendre functions with fixed index m forms a full orthogonal set, with the normalization relation,

$$\int_{-1}^{+1}P_l^m(\xi)P_{l'}^m(\xi)\,d\xi = \frac{2}{2l+1}\frac{(l + m)!}{(l - m)!}\delta_{ll'}, \tag{3.170}$$

which is evidently a generalization of Eq. (167) for arbitrary m.


Since the difference between angles $\theta$ and $\varphi$ is to some extent artificial (caused by the arbitrary
direction of the polar axis), physicists prefer to use not the functions $\Theta(\theta) \propto P_l^m(\cos\theta)$ and
$F_m(\varphi) \propto \exp\{im\varphi\}$ separately, but their products (157), which are called spherical harmonics:

$$Y_l^m(\theta, \varphi) = \left[\frac{(2l+1)}{4\pi}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2} P_l^m(\cos\theta)\, e^{im\varphi}. \tag{3.171}$$


The specific coefficient in Eq. (171) is chosen in a way to simplify the following two relations: the
equation for negative m,

$$Y_l^{-m}(\theta, \varphi) = (-1)^m \left[Y_l^m(\theta, \varphi)\right]^*, \tag{3.172}$$

and the normalization relation

$$\oint Y_l^m(\theta, \varphi)\left[Y_{l'}^{m'}(\theta, \varphi)\right]^* d\Omega = \delta_{ll'}\,\delta_{mm'}, \tag{3.173}$$

with integration over the whole solid angle $4\pi$. The last relation shows that the spherical harmonics form
an orthonormal set of functions. This set is also full, so that any function defined on a sphere may be
uniquely presented as a linear combination of $Y_l^m$.


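Despite the somewhat intimidating formulas given above, they yield rather simple expressions for
the lowest spherical harmonics:

$$l = 0: \quad Y_0^0 = \left(\frac{1}{4\pi}\right)^{1/2}, \tag{3.174}$$

$$l = 1: \quad \begin{cases} Y_1^{1} = -\left(\dfrac{3}{8\pi}\right)^{1/2} \sin\theta\, e^{i\varphi}, \\ Y_1^{0} = \left(\dfrac{3}{4\pi}\right)^{1/2} \cos\theta, \\ Y_1^{-1} = +\left(\dfrac{3}{8\pi}\right)^{1/2} \sin\theta\, e^{-i\varphi}, \end{cases} \tag{3.175}$$

$$l = 2: \quad \begin{cases} Y_2^{2} = +\left(\dfrac{15}{32\pi}\right)^{1/2} \sin^2\theta\, e^{2i\varphi}, \\ Y_2^{1} = -\left(\dfrac{15}{8\pi}\right)^{1/2} \sin\theta\cos\theta\, e^{i\varphi}, \\ Y_2^{0} = \left(\dfrac{5}{16\pi}\right)^{1/2} \left(3\cos^2\theta - 1\right), \\ Y_2^{-1} = +\left(\dfrac{15}{8\pi}\right)^{1/2} \sin\theta\cos\theta\, e^{-i\varphi}, \\ Y_2^{-2} = +\left(\dfrac{15}{32\pi}\right)^{1/2} \sin^2\theta\, e^{-2i\varphi}. \end{cases} \tag{3.176}$$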
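The orthonormality relation (3.173) can be verified numerically for these lowest harmonics. The sketch below assumes NumPy and tabulates only the standard expressions for l ≤ 2 (with the sign convention of Eq. (3.172)); the names `sph_harm_low` and `overlap`, and the grid sizes, are illustrative assumptions.

```python
import numpy as np


def sph_harm_low(l, m, theta, phi):
    """Spherical harmonics Y_l^m for l <= 2, using the standard explicit expressions."""
    if m < 0:
        # Negative m from the relation Y_l^{-m} = (-1)^m [Y_l^m]^*, Eq. (3.172).
        return (-1) ** (-m) * np.conj(sph_harm_low(l, -m, theta, phi))
    s, c = np.sin(theta), np.cos(theta)
    if (l, m) == (0, 0):
        return np.sqrt(1 / (4 * np.pi)) * np.ones_like(theta, dtype=complex)
    if (l, m) == (1, 0):
        return np.sqrt(3 / (4 * np.pi)) * c + 0j
    if (l, m) == (1, 1):
        return -np.sqrt(3 / (8 * np.pi)) * s * np.exp(1j * phi)
    if (l, m) == (2, 0):
        return np.sqrt(5 / (16 * np.pi)) * (3 * c**2 - 1) + 0j
    if (l, m) == (2, 1):
        return -np.sqrt(15 / (8 * np.pi)) * s * c * np.exp(1j * phi)
    if (l, m) == (2, 2):
        return np.sqrt(15 / (32 * np.pi)) * s**2 * np.exp(2j * phi)
    raise ValueError("only l <= 2 with |m| <= l is tabulated in this sketch")


def overlap(l1, m1, l2, m2, n_theta=16, n_phi=32):
    """Integral of Y_{l1}^{m1} [Y_{l2}^{m2}]^* over the full solid angle."""
    # Gauss-Legendre nodes in xi = cos(theta); uniform (periodic) grid in phi.
    x, w = np.polynomial.legendre.leggauss(n_theta)
    phi = 2 * np.pi * np.arange(n_phi) / n_phi
    theta_grid, phi_grid = np.meshgrid(np.arccos(x), phi, indexing="ij")
    integrand = sph_harm_low(l1, m1, theta_grid, phi_grid) * np.conj(
        sph_harm_low(l2, m2, theta_grid, phi_grid)
    )
    return complex(np.sum(w[:, None] * integrand) * (2 * np.pi / n_phi))


if __name__ == "__main__":
    print(abs(overlap(2, 1, 2, 1)))   # same function: should be close to 1
    print(abs(overlap(2, 1, 1, 1)))   # different l: should be close to 0
    print(abs(overlap(2, -1, 2, 1)))  # different m: should be close to 0
```

Integrating over $\xi = \cos\theta$ rather than over $\theta$ absorbs the factor $\sin\theta$ of the solid-angle element, so that the Gauss-Legendre weights can be used directly.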


It is important to understand the symmetry of these functions. Since spherical functions with m ≠
0 are complex, the most popular way of their graphical representation is first to form their real
combinations corresponding to two opposite values of m, 63

$$Y_{lm} \equiv \begin{cases} \dfrac{1}{\sqrt{2}}\left[Y_l^{-|m|} + (-1)^m\, Y_l^{|m|}\right] \propto \cos m\varphi, & \text{for } m > 0, \\[2mm] \dfrac{i}{\sqrt{2}}\left[Y_l^{-|m|} - (-1)^m\, Y_l^{|m|}\right] \propto \sin m\varphi, & \text{for } m < 0, \end{cases} \tag{3.177}$$

(for m = 0, $Y_{l0} \equiv Y_l^0$), and then plot the magnitude of these combinations in spherical coordinates as the
distance from the origin, while using two colors to show their sign - see Fig. 19.


Fig. 3.19. Several lowest real spherical harmonics $Y_{lm}$, with rows of panels for l = 1 (p states) and
l = 2 (d states), and panels labeled by the values of m. (Adapted from Web site
http://people.csail.mit.edu/sparis/ .)



63 Such real functions $Y_{lm}$, which also form a full set of orthonormal eigenfunctions and are frequently called the
real spherical harmonics, are more convenient than the complex functions $Y_l^m$ for several applications, especially
when the variables of interest are real by definition.
Let us start from the simplest case l = 0. According to Eq. (162), there could be only one such
s state, 64 with m = 0. The spherical harmonic corresponding to that state is just a constant, so that the
wavefunction is uniformly distributed over the sphere. Since the function does not have a gradient in any
direction, the kinetic energy (163) of the particle equals zero.

For l = 1, there could be 3 different p states, with m = -1, 0, and +1. As the second row in Fig. 19
shows, these states are essentially identical in structure, and are just differently oriented in space, thus
explaining the 3-fold degeneracy of the kinetic energy - see Eq. (163). This is not quite true for 5
different d states (l = 2), shown in the bottom row of Fig. 19, as well as states with higher l: despite their
equal energies, they differ not only by their spatial orientation. The states with m = 0 have a gradient only
in the $\theta$ direction, while the states with the ultimate values of m (m = ±l) change only gradually (as $\sin^l\theta$)
in the polar direction, while oscillating in the azimuthal direction. The states with intermediate values of
m provide a crossover between these two extremes, oscillating in both directions, stronger and stronger
in the direction of $\varphi$ as |m| is increased. Still, the magnetic quantum number, surprisingly, does not
affect the energy for any l. Another surprising feature of the spherical harmonics follows from the
comparison of Eq. (163) with the second of classical relations (152). These expressions coincide if we
interpret the constant

$$L^2 = \hbar^2\, l(l+1), \tag{3.178}$$

as the value of the full angular momentum squared $L^2 = |\mathbf{L}|^2$ (including both its $\theta$ and $\varphi$ components) in
the eigenstate with eigenfunction $Y_l^m$. On the other hand, the structure of the azimuthal component $F(\varphi)$
of the wavefunction is exactly the same as in 2D axially-symmetric problems, suggesting that Eq. (139)
still gives correct values (in our new notation, $L_z = m\hbar$) for the z-component of the angular momentum.
If this is so, why, for any state with l > 0, is $(L_z)^2 = m^2\hbar^2 \le l^2\hbar^2$ less than $L^2 = l(l+1)\hbar^2$? In other words,
what prevents the angular momentum vector from being fully aligned with axis z?

Besides that issue, though the above analysis of the spherical rotator is formally 
(mathematically) complete, it is as unsatisfactory on the physics level as the harmonic oscillator analysis 
in Sec. 2.6. In particular, it does not explain the meaning of the extremely simple relations for 
eigenvalues of energy and angular momentum on the backdrop of rather complicated wavefunctions. 

We will obtain natural answers to all these questions and concerns in Sec. 5.6, but now let us 
complete our survey of wave mechanics by extending it to 3D motion in an arbitrary spherically- 
symmetric potential (155). In this case we have to use the full form of the Laplace operator in spherical 
coordinates. The variable separation procedure is an evident generalization of what we have done 
before, with the particular solution 

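$$\psi = \mathcal{R}(r)\,\Theta(\theta)\,F(\varphi), \tag{3.179}$$

whose substitution into the stationary Schrödinger equation yields

$$-\frac{\hbar^2}{2mr^2}\left[\frac{1}{\mathcal{R}}\frac{d}{dr}\left(r^2\frac{d\mathcal{R}}{dr}\right) + \frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\sin^2\theta}\frac{1}{F}\frac{d^2F}{d\varphi^2}\right] + U(r) = E. \tag{3.180}$$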


64 The letter names for states with different values of l stem from the history of optical spectroscopy - for
example, letter “s”, used for l = 0, originally denoted the “sharp” optical line series, etc. The sequence of the
letters is as follows: s, p, d, f, g, h, and further in the alphabetical order.
It is evident that the angular part (the two last terms in square brackets) separates from the radial
part, and for the former part we get Eq. (156) again, with the only change, R → r. This change does not
affect the fact that the eigenfunctions of that equation are the spherical harmonics (171), and the angular
eigenenergy is given by Eq. (163), again with the replacement R → r. This means that for the radial
function, Eq. (180) gives the following equation,

$$-\frac{\hbar^2}{2mr^2}\left[\frac{1}{\mathcal{R}}\frac{d}{dr}\left(r^2\frac{d\mathcal{R}}{dr}\right) - l(l+1)\right] + U(r) = E. \tag{3.181}$$

Note that no information about the magnetic quantum number m has crept into the radial equation
(besides establishing the limitation (162) for possible values of l), so that this equation depends only on
the latter quantum number.


The radial equation becomes rather simple for U(r) = 0, and may be used, for example, to solve
the eigenproblem for the free 3D motion of a particle inside the sphere of radius R. Leaving that problem
for the reader’s exercise, I will proceed to the most important Bohr atom problem, i.e. that of motion in the
so-called attractive Coulomb potential 65

$$U(r) = -\frac{C}{r}, \quad \text{with } C > 0. \tag{3.182}$$

The natural scales of r and E are, respectively, 66

$$r_0 \equiv \frac{\hbar^2}{mC} \quad \text{and} \quad E_0 \equiv \frac{\hbar^2}{mr_0^2} = m\left(\frac{C}{\hbar}\right)^2. \tag{3.183}$$


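In the normalized units $\varepsilon \equiv E/E_0$ and $\xi \equiv r/r_0$, Eq. (181) looks simpler,

$$\frac{d^2\mathcal{R}}{d\xi^2} + \frac{2}{\xi}\frac{d\mathcal{R}}{d\xi} - \frac{l(l+1)}{\xi^2}\mathcal{R} + 2\left(\varepsilon + \frac{1}{\xi}\right)\mathcal{R} = 0, \tag{3.184}$$

but unfortunately its eigenfunctions may be called elementary only in the most generous meaning of the
word. With the adequate normalization,

$$\int_0^\infty \mathcal{R}_{n,l}(r)\,\mathcal{R}_{n',l}(r)\, r^2\, dr = \delta_{nn'}, \tag{3.185}$$

these (mutually orthogonal) functions may be presented as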


65 Historically, the solution of this problem in 1926, which reproduced the main result (1.8)-(1.9) of the “old”
quantum theory developed by N. Bohr in 1912, without its restrictive assumptions, was the decisive step for the
general acceptance of Schrödinger’s wave mechanics.

66 These two scales are obtained from relations $E_0 \equiv \hbar^2/mr_0^2 \equiv C/r_0$, i.e. from the equality of the natural scales of
the potential and kinetic energies, dropping all numerical coefficients. For the most important case of the
hydrogen atom, $C = e^2/4\pi\varepsilon_0$, these scales are reduced, respectively, to the Bohr radius $r_B$ (1.13) and the Hartree
energy $E_H$ (1.9). Note also that for a hydrogen-like atom (or rather ion), with $C = Z(e^2/4\pi\varepsilon_0)$, these two key
parameters are rescaled as $r_0 = r_B/Z$, $E_0 = Z^2 E_H$.
$$\mathcal{R}_{n,l}(r) = \frac{2}{n^2 r_0^{3/2}}\left\{\frac{(n-l-1)!}{[(n+l)!]^3}\right\}^{1/2}\left(\frac{2r}{nr_0}\right)^l \exp\left\{-\frac{r}{nr_0}\right\} L_{n-l-1}^{2l+1}\left(\frac{2r}{nr_0}\right). \tag{3.186}$$

Here $L_p^q(\xi)$ are the so-called associated Laguerre polynomials, which may be calculated as

$$L_p^q(\xi) = (-1)^q\, \frac{d^q}{d\xi^q} L_{p+q}(\xi) \tag{3.187}$$

from simple Laguerre polynomials $L_p(\xi) \equiv L_p^0(\xi)$. 67 In turn, the easiest way to obtain $L_p(\xi)$ is to use the
following Rodrigues formula: 68

$$L_p(\xi) = e^{\xi}\, \frac{d^p}{d\xi^p}\left(\xi^p e^{-\xi}\right). \tag{3.188}$$


Notice that in contrast with the associated Legendre functions $P_l^m$ participating in spherical harmonics,
$L_p^q$ are just polynomials, and those with small indices p and q are indeed simple.
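The chain of definitions (3.186)-(3.188) may be turned into a short numerical sketch that builds the radial functions and checks the normalization (3.185). It assumes NumPy, units with $r_0 = 1$, and the convention $L_p(\xi) = e^{\xi}\, d^p(\xi^p e^{-\xi})/d\xi^p$ (without a factor 1/p!), for which the normalization constant of Eq. (3.186) contains $[(n+l)!]^3$. All function names and the integration grid are illustrative.

```python
from math import factorial

import numpy as np
from numpy.polynomial import Polynomial


def laguerre(p):
    """L_p(xi) = e^xi d^p/dxi^p (xi^p e^-xi), Eq. (3.188).

    Since d/dxi [f e^-xi] = (f' - f) e^-xi, each derivative maps the
    polynomial prefactor f to f' - f.
    """
    f = Polynomial([0.0] * p + [1.0])  # xi^p
    for _ in range(p):
        f = f.deriv() - f
    return f


def assoc_laguerre(p, q):
    """L_p^q(xi) = (-1)^q d^q/dxi^q L_{p+q}(xi), Eq. (3.187)."""
    return (-1) ** q * laguerre(p + q).deriv(q)


def radial(n, l, r):
    """Radial function R_{n,l}(r) of Eq. (3.186), in units with r_0 = 1."""
    rho = 2.0 * r / n
    norm = (2.0 / n**2) * np.sqrt(factorial(n - l - 1) / factorial(n + l) ** 3)
    return norm * rho**l * np.exp(-r / n) * assoc_laguerre(n - l - 1, 2 * l + 1)(rho)


def radial_overlap(n1, n2, l, r_max=100.0, points=200_001):
    """Integral of R_{n1,l} R_{n2,l} r^2 dr over [0, r_max], trapezoid rule."""
    r = np.linspace(0.0, r_max, points)
    y = radial(n1, l, r) * radial(n2, l, r) * r**2
    dr = r[1] - r[0]
    return float(dr * (y.sum() - 0.5 * (y[0] + y[-1])))


if __name__ == "__main__":
    print(radial_overlap(1, 1, 0), radial_overlap(3, 3, 2))  # both close to 1
    print(radial_overlap(1, 2, 0))                           # close to 0
```

The cutoff `r_max` is an assumption that is adequate only for small n; the exponential tails extend to distances of the order of $n^2 r_0$, so larger n would need a larger cutoff.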


Returning to Eq. (186), we see that the natural quantization of the radial equation (184) has
brought us a new quantum number (integer) n. In order to understand its range, we should notice that
according to Eq. (188), the highest power of terms in polynomial $L_{p+q}$ is (p + q), and hence, according to
Eq. (187), that of $L_p^q$ is p, so that the highest power in the polynomial participating in Eq. (186) is
(n - l - 1). Since the power cannot be negative (to avoid the unphysical divergence of wavefunctions at
r → 0), the radial quantum number n has to obey the restriction n ≥ l + 1. Since l, as we already know,
may take values l = 0, 1, 2, ..., we may conclude that n may only take values

$$n = 1, 2, 3, \ldots \tag{3.189}$$


What makes this relation important is the following, most surprising result of the theory: the
eigenenergies corresponding to wavefunctions (179), which are indexed with 3 quantum numbers:

$$\psi_{n,l,m} = \mathcal{R}_{n,l}(r)\, Y_l^m(\theta, \varphi), \tag{3.190}$$

depend only on n and agree with Bohr’s formula (1.8):

$$E_n = -\frac{E_0}{2n^2} = -\frac{1}{2n^2}\, m\left(\frac{C}{\hbar}\right)^2. \tag{3.191}$$


For this reason, n is usually called the principal quantum number, and the above relation
between it and the “more subordinate” l is rewritten as

$$l \le n - 1. \tag{3.192}$$

Together with inequality (162), this gives us the most important hierarchy of the 3 quantum
numbers involved in the problem:

$$1 \le n < \infty \ \Rightarrow\ 0 \le l \le n - 1 \ \Rightarrow\ -l \le m \le +l. \tag{3.193}$$

67 In Eqs. (187)-(188), p and q are non-negative integers, with no relation whatsoever to particle’s momentum or
electric charge. Sorry for this notation, but it is absolutely common, and can hardly result in any confusion.

68 Named after the same B. O. Rodrigues, and belonging to the same class as his other key result, Eq. (165).



Taking into account the (2l + 1)-fold degeneracy related to the magnetic number m, and using the well-known
formula for the arithmetic progression, 69 we see that each energy level (191) has the following orbital
degeneracy:

$$g = \sum_{l=0}^{n-1}(2l+1) = 2\sum_{l=0}^{n-1} l + \sum_{l=0}^{n-1} 1 = 2\,\frac{n(n-1)}{2} + n = n^2. \tag{3.194}$$

Due to its importance for applications, let us spell out the quantum number hierarchy of a few lowest-
energy states, using the traditional notation in which the value of n is followed by the letter that denotes
the value of l:

n = 1:   l = 0   (one 1s state)      m = 0.                    (3.195)

n = 2:   l = 0   (one 2s state)      m = 0,
         l = 1   (three 2p states)   m = 0, ±1.                (3.196)

n = 3:   l = 0   (one 3s state)      m = 0,
         l = 1   (three 3p states)   m = 0, ±1,
         l = 2   (five 3d states)    m = 0, ±1, ±2.            (3.197)


Figure 20 shows plots of the radial functions (186) of the listed states. The most important of
them is of course the ground (1s) state with n = 1 and hence $E = -E_0/2$, whose radial function (186) is
just

$$\mathcal{R}_{1,0}(r) = \frac{2}{r_0^{3/2}}\, e^{-r/r_0}, \tag{3.198}$$

and the angular distribution is uniform - see Eq. (174). The gap between the ground energy and the
energy $E = -E_0/8$ of the lowest excited states (with n = 2) in a hydrogen atom (in which $E_0 = E_H \approx 27.2$
eV) is as large as ~10 eV, so that their thermal excitation requires temperatures as high as ~$10^5$ K, and
the overwhelming part of all hydrogen atoms in the visible Universe are in their ground state. Since
atomic hydrogen makes up about 75% of the “normal” matter, we are very fortunate that such simple
formulas as Eqs. (174) and (198) describe the atomic states most frequently met in Mother Nature! 70
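The numbers quoted in this estimate follow from Eq. (3.191) by simple arithmetic, as the following sketch shows. The value of the Boltzmann constant in eV/K is an added assumption (it is not given in the text), and the temperature is just the crude scale $\Delta E / k_B$.

```python
E_HARTREE_EV = 27.2       # Hartree energy E_H in eV, the value quoted in the text
K_B_EV_PER_K = 8.617e-5   # Boltzmann constant in eV/K (assumed, not from the text)


def level_ev(n, e0=E_HARTREE_EV):
    """Energy E_n = -E_0 / (2 n^2) of the level with principal quantum number n."""
    return -e0 / (2 * n**2)


def excitation_gap_ev():
    """Gap between the ground (n = 1) and the first excited (n = 2) levels."""
    return level_ev(2) - level_ev(1)


def excitation_temperature_k():
    """Temperature scale at which k_B T becomes comparable with the gap."""
    return excitation_gap_ev() / K_B_EV_PER_K


if __name__ == "__main__":
    print(f"E_1 = {level_ev(1):.2f} eV, E_2 = {level_ev(2):.2f} eV")
    print(f"gap = {excitation_gap_ev():.2f} eV")
    print(f"T ~ {excitation_temperature_k():.2e} K")
```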


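The radial functions of the next states, 2s and 2p, are also not too complex:

$$\mathcal{R}_{2,0}(r) = \frac{1}{(2r_0)^{3/2}}\left(2 - \frac{r}{r_0}\right)e^{-r/2r_0}, \qquad \mathcal{R}_{2,1}(r) = \frac{1}{(2r_0)^{3/2}\, 3^{1/2}}\,\frac{r}{r_0}\, e^{-r/2r_0}. \tag{3.199}$$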


(Note again that the former of these states (2s) can only have a uniform angular distribution, while three
2p states have different values of m = 0, ±1, and hence have different angular distributions - see Eq.
(175) and the second row of Fig. 19.) The most important trend here is a larger radius of decay of the
exponent ($2r_0$ for n = 2 instead of $r_0$ for n = 1), and hence a larger radial extension of the states. This trend is
confirmed by the following general formula: 71


69 See, e.g., MA Eq. (2.5a). 

70 Forgetting for a minute about such new “dark clouds” on the horizon of modern physics as the hypothetical
dark matter and dark energy.

71 Note that even at the largest value of l, equal to (n - 1), term l(l + 1) in Eq. (200) cannot compensate term $3n^2$.
$$\langle r \rangle_{n,l} = \frac{r_0}{2}\left[3n^2 - l(l+1)\right]. \tag{3.200}$$

The second important trend is that at fixed n, the orbital quantum number l determines how fast
the wavefunction changes with r near the origin, and how much it oscillates in the radial direction at
larger r. For example, the 2s eigenfunction $\psi_{2,0,0}$ is nonvanishing at r = 0, and makes one “wiggle” (has
one root) in the radial direction, while eigenfunctions 2p equal zero at r = 0, and do not oscillate at all in
the radial direction. Instead, those wavefunctions always oscillate as functions of some angle - see the
second row of Fig. 19. The same trend is clearly visible for n = 3 (see Fig. 20), and continues for the
higher values of n.



Fig. 3.20. The radial functions (186) of the lowest states, plotted versus $r/r_0$.


The interpretation of these results is that the states with $l = l_{max} = n - 1$ may be viewed as analogs
of the circular motion of a particle in a plane whose orientation defines the quantum number m, with an
almost fixed radius $r \approx r_0(n^2 \pm n)$. On the other hand, the best classical image of an s state (l = 0) is the
purely radial motion of the particle to and from the attracting center. (The latter image is especially
imperfect, because the motion would need to happen simultaneously in all radial directions.) The
classical language becomes reasonable only for the so-called Rydberg states, with n >> 1, whose linear
superpositions may be used to compose wave packets closely following the classical, circular or elliptic
trajectories of the particle - just as was discussed in Sec. 2.2 for the free 1D motion.
Besides Eq. (200), mathematics gives us several other simple relations for the radial functions
$\mathcal{R}_{n,l}$ (and, since the spherical harmonics are normalized to 1, for the eigenfunctions as a whole),
including those that we will use later in the course: 72

$$\left\langle\frac{1}{r}\right\rangle_{n,l} = \frac{1}{n^2 r_0}, \qquad \left\langle\frac{1}{r^2}\right\rangle_{n,l} = \frac{1}{n^3\,(l+1/2)\,r_0^2}, \qquad \left\langle\frac{1}{r^3}\right\rangle_{n,l} = \frac{1}{n^3\, l\,(l+1/2)(l+1)\,r_0^3}. \tag{3.201}$$



In particular, the first of them means that for any eigenfunction, with all its complicated radial and
angular dependencies, there is a simple relation between the potential and full energies:

$$\langle U \rangle_{n,l} = -C\left\langle\frac{1}{r}\right\rangle_{n,l} = -\frac{C}{n^2 r_0} = -\frac{E_0}{n^2} = 2E_n, \tag{3.202}$$

so that the average kinetic energy of the particle, $\langle T \rangle_{n,l} = E_n - \langle U \rangle_{n,l}$, is equal to $|E_n| > 0$.
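These averages are easy to confirm for the three lowest radial functions. The sketch below assumes NumPy and units with $r_0 = E_0 = C = 1$, and uses the standard explicit 1s, 2s, and 2p radial functions; the helper names and the integration grid are illustrative.

```python
import numpy as np


def radial_1s(r):
    return 2.0 * np.exp(-r)


def radial_2s(r):
    return (2.0 - r) * np.exp(-r / 2) / 2**1.5


def radial_2p(r):
    return r * np.exp(-r / 2) / (2**1.5 * np.sqrt(3.0))


def average(radial, power, r_max=80.0, points=400_001):
    """<r^power> = integral of R^2 r^(2 + power) dr, trapezoid rule (power >= -2)."""
    r = np.linspace(0.0, r_max, points)
    y = radial(r) ** 2 * r ** (2 + power)
    dr = r[1] - r[0]
    return float(dr * (y.sum() - 0.5 * (y[0] + y[-1])))


def energy_balance(radial, n):
    """Return (<U>, <T>) in units of E_0, using E_n = -1/(2 n^2) and U = -1/r."""
    potential = -average(radial, -1)
    kinetic = -1.0 / (2 * n**2) - potential
    return potential, kinetic


if __name__ == "__main__":
    for name, radial, n, l in (("1s", radial_1s, 1, 0),
                               ("2s", radial_2s, 2, 0),
                               ("2p", radial_2p, 2, 1)):
        expected_r = 0.5 * (3 * n**2 - l * (l + 1))  # Eq. (3.200) with r_0 = 1
        print(name, average(radial, 1), expected_r, energy_balance(radial, n))
```

In each case the potential energy comes out as $2E_n$ and the kinetic energy as $|E_n|$, regardless of the shape of the particular radial function.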

These simple results are in a sharp contrast with the rather complicated expressions for the 
eigenfunctions, and motivate a search for more general methods of quantum mechanics, which would 
replace or at least complement our brute-force (wave-mechanics) approach, to reveal their real nature. 
Such an approach will be the main topic of the next chapter. 


3.7. Atoms 

Before proceeding to that chapter, let me show that, rather strikingly, the classification of
quantum numbers in the simple potential well (182), carried out in the last section, together with very
modest borrowings from the further theory, allows a semi-quantitative explanation of the whole system
of chemical elements. The “only” two additions we need are the following facts:

(i) due to interaction with relatively low-temperature environments, atoms tend to relax into their 
lowest-energy state, and 

(ii) due to the Pauli principle (valid for electrons as Fermi particles), each orbital eigenstate 
discussed above can be occupied with 2 electrons with opposite spins. 

Of course, atomic electrons do interact, so that their quantitative description requires quantum 
mechanics of multiparticle systems, which is rather complex. (Its main concepts will be discussed in 
Chapter 8.) However, the lion’s share of this interaction reduces to simple electrostatic screening, i.e. 
the partial compensation of the electric charge of the atomic nucleus, as felt by a particular electron, by 
other electrons of the atom. This screening changes the quantitative results (such as the energy scale)
dramatically; however, the quantum number hierarchy, and hence their classification, is not affected.

The system of atoms is most often presented as the famous periodic table of chemical elements , 73 
whose simple version is shown in Fig. 21, while Fig. 22 presents a sequential list of the elements with 
their electron configurations. The numbers in table’s cells (and the first column in the list) are the 


72 The first of these relations may be also readily proved using the Hellmann-Feynman theorem (see Chapter 1); this
proof is left for reader’s exercise. Note also that the last of the expressions diverges at l = 0, in particular in the
ground state of the system (with n = 1, l = 0).

73 Also called the Mendeleev table, after D. Mendeleev who put forward the concept of the periodicity of 
chemical element properties as functions of Z phenomenologically in 1869. (The explanation of the periodicity 
had to wait for 60 more years until the quantum mechanics formulation in the late 1920s.) 
atomic numbers Z, which physically are the numbers of protons in the atomic nucleus, and hence the
numbers of electrons in the electrically neutral atom. The electron configuration in Fig. 22 follows the
convention already used in Eqs. (195)-(197), with the additional upper index showing the number of
electrons with the indicated values of quantum numbers n and l.

The lightest atom, with Z = 1, is hydrogen (chemical symbol H) - the only atom for which the
theory discussed in Sec. 6 is quantitatively correct. 74 According to Eq. (191), the 1s ground state of its
only electron corresponds to quantum numbers n = 1, l = 0, and m = 0 - see Eq. (195). In most versions
of the periodic table, the cell of H is placed in the top left corner. In the next atom, helium (He, Z = 2),
the same orbital quantum state (1s) is filled with two electrons with different spins. 75 Note that due to
the twice higher electric charge of the nucleus, i.e. the twice higher value of constant C in Eq. (182),
resulting in a 4-fold increase of constant $E_0$ (183), the binding energy of each electron is crudely 4 times
higher than that of the hydrogen atom - though the electron interaction decreases it by about 25% - see
Sec. 7.2. This is why taking one electron away from the helium atom (i.e. its positive ionization) requires a
very high energy, 24.6 eV, which is not available in usual chemical reactions. On the other hand, a
neutral helium atom cannot bind one more electron (i.e. form a negative ion) either. As a result, helium,
like all other elements with fully completed electron shells (sets of states with eigenenergies well
separated from higher energy levels), is a chemically inert noble gas, thus starting the whole right-most
column of the periodic table, committed to such elements.
Fig. 3.21. The periodic table of elements, showing their atomic numbers, as well as their basic
physical/chemical properties at the so-called ambient (meaning usual laboratory) conditions.


74 Besides very small “fine-structure” corrections - to be discussed in Chapters 6 and 9. 

75 As will be discussed in detail in Chapter 8, electrons of the same atom are actually indistinguishable, and their 
quantum states are not independent, and frequently entangled. These factors are important for several properties 
of helium atoms (and heavier elements as well), especially for their response to external fields. However, for the 
atom classification purposes, they are not crucial. 
Atomic number, atomic symbol, and electron states:

Period 1:
  1 H 1s^1;  2 He 1s^2

Period 2 ([He] shell, plus):
  3 Li 2s^1;  4 Be 2s^2;  5 B 2s^2 2p^1;  6 C 2s^2 2p^2;  7 N 2s^2 2p^3;  8 O 2s^2 2p^4;
  9 F 2s^2 2p^5;  10 Ne 2s^2 2p^6

Period 3 ([Ne] shell, plus):
  11 Na 3s^1;  12 Mg 3s^2;  13 Al 3s^2 3p^1;  14 Si 3s^2 3p^2;  15 P 3s^2 3p^3;  16 S 3s^2 3p^4;
  17 Cl 3s^2 3p^5;  18 Ar 3s^2 3p^6

Period 4 ([Ar] shell, plus):
  19 K 4s^1;  20 Ca 4s^2;  21 Sc 3d^1 4s^2;  22 Ti 3d^2 4s^2;  23 V 3d^3 4s^2;  24 Cr 3d^5 4s^1;
  25 Mn 3d^5 4s^2;  26 Fe 3d^6 4s^2;  27 Co 3d^7 4s^2;  28 Ni 3d^8 4s^2;  29 Cu 3d^10 4s^1;
  30 Zn 3d^10 4s^2;  31 Ga 3d^10 4s^2 4p^1;  32 Ge 3d^10 4s^2 4p^2;  33 As 3d^10 4s^2 4p^3;
  34 Se 3d^10 4s^2 4p^4;  35 Br 3d^10 4s^2 4p^5;  36 Kr 3d^10 4s^2 4p^6

Period 5 ([Kr] shell, plus):
  37 Rb 5s^1;  38 Sr 5s^2;  39 Y 4d^1 5s^2;  40 Zr 4d^2 5s^2;  41 Nb 4d^4 5s^1;  42 Mo 4d^5 5s^1;
  43 Tc 4d^5 5s^2;  44 Ru 4d^7 5s^1;  45 Rh 4d^8 5s^1;  46 Pd 4d^10;  47 Ag 4d^10 5s^1;
  48 Cd 4d^10 5s^2;  49 In 4d^10 5s^2 5p^1;  50 Sn 4d^10 5s^2 5p^2;  51 Sb 4d^10 5s^2 5p^3;
  52 Te 4d^10 5s^2 5p^4;  53 I 4d^10 5s^2 5p^5;  54 Xe 4d^10 5s^2 5p^6

Period 6 ([Xe] shell, plus):
  55 Cs 6s^1;  56 Ba 6s^2;  57 La 5d^1 6s^2;  58 Ce 4f^1 5d^1 6s^2;  59 Pr 4f^3 6s^2;
  60 Nd 4f^4 6s^2;  61 Pm 4f^5 6s^2;  62 Sm 4f^6 6s^2;  63 Eu 4f^7 6s^2;  64 Gd 4f^7 5d^1 6s^2;
  65 Tb 4f^9 6s^2;  66 Dy 4f^10 6s^2;  67 Ho 4f^11 6s^2;  68 Er 4f^12 6s^2;  69 Tm 4f^13 6s^2;
  70 Yb 4f^14 6s^2;  71 Lu 4f^14 5d^1 6s^2;  72 Hf 4f^14 5d^2 6s^2;  73 Ta 4f^14 5d^3 6s^2;
  74 W 4f^14 5d^4 6s^2;  75 Re 4f^14 5d^5 6s^2;  76 Os 4f^14 5d^6 6s^2;  77 Ir 4f^14 5d^7 6s^2;
  78 Pt 4f^14 5d^9 6s^1;  79 Au 4f^14 5d^10 6s^1;  80 Hg 4f^14 5d^10 6s^2;
  81 Tl 4f^14 5d^10 6s^2 6p^1;  82 Pb 4f^14 5d^10 6s^2 6p^2;  83 Bi 4f^14 5d^10 6s^2 6p^3;
  84 Po 4f^14 5d^10 6s^2 6p^4;  85 At 4f^14 5d^10 6s^2 6p^5;  86 Rn 4f^14 5d^10 6s^2 6p^6

Period 7 ([Rn] shell, plus):
  87 Fr 7s^1;  88 Ra 7s^2;  89 Ac 6d^1 7s^2;  90 Th 6d^2 7s^2;  91 Pa 5f^2 6d^1 7s^2;
  92 U 5f^3 6d^1 7s^2;  93 Np 5f^4 6d^1 7s^2;  94 Pu 5f^6 7s^2;  95 Am 5f^7 7s^2;
  96 Cm 5f^7 6d^1 7s^2;  97 Bk 5f^9 7s^2;  98 Cf 5f^10 7s^2;  99 Es 5f^11 7s^2;  100 Fm 5f^12 7s^2;
  101 Md 5f^13 7s^2;  102 No 5f^14 7s^2;  103 Lr 5f^14 6d^1 7s^2;  104 Rf 5f^14 6d^2 7s^2;
  105 Db …;  106 Sg …;  107 Bh …;  108 Hs …;  109 Mt …;  110 Ds …;  111 Rg …;  112 Cn …;
  113 Uut …;  114 Fl …;  115 Uup …;  116 Lv …;  117 Uus …;  118 Uuo 5f^14 6d^10 7s^2 7p^6

Fig. 3.22. Atomic electron configurations. The upper index shows the number of electrons in states with the
indicated quantum numbers n (the first digit) and l (letter-coded as listed above).
The situation changes dramatically as we move to the next element, lithium (Li), with Z = 3
electrons. Two of them are still accommodated by the inner shell n = 1 (listed in Fig. 22 as the helium
shell [He]), but the third one has to reside in the next shell with n = 2 and l = 0, i.e. in the 2s state.
According to Eq. (191), the binding energy of this electron is much lower, especially if we take into
account that according to Eq. (200), the 1s electrons of the [He] shell are much closer to the nucleus and
almost completely compensate two thirds of its electric charge +3e. As a result, the 2s electron is
reasonably well described by Eq. (199), with binding energy of just 5.39 eV, so that a lithium atom can 
give out that electron rather easily - to either atoms of other elements to form chemical compounds, or 
into the common conduction band of solid state lithium - and as a result it is a typical alkali metal. The 
similarity of chemical properties of lithium and hydrogen, with the chemical valence of one, 76 places Li 
as the starting element of the second period (row), with the first period limited to only H and He. 

In the next element, beryllium (Z = 4), the 2s state (n = 2, l = 0) picks up one more electron, with
the opposite spin. Due to the higher electric charge of the nucleus, Q = 4e, with only half of it
compensated by 1s electrons of the [He] shell, the binding energy of the 2s electrons is higher than in
lithium, so that the ionization energy increases to 9.32 eV. As a result, beryllium is also chemically
active but not as active as lithium, with the valence of two, and is also metallic in its solid state phase,
but does not conduct electric current as well as lithium.

Moving in this way along the second row of the periodic table (from Z = 3 to Z = 10), we see the
gradual filling of all 4 different orbital states of the n = 2 shell, by 2 electrons each, with gradually
growing ionization potential (up to 21.6 eV in Ne with Z = 10), i.e. the growing reluctance to have
metallic conductance or form positive ions. However, the final elements of the row, such as oxygen (O,
with Z = 8) and especially fluorine (F, with Z = 9) can readily pick up extra electrons to fill their 2p
states, i.e. form negative ions. As a result, these elements are chemically active, with the double valence
for oxygen and single valence for fluorine. However, the final element of this row, neon, has its n = 2
shell full, and cannot form a stable negative ion. This is why it is a noble gas, like helium. Traditionally,
in the periodic table it is placed right under helium (Fig. 21), to emphasize the similarity of their
chemical and physical properties. But this necessitates making an at least 6-cell gap in the 1st row.
(Actually, the gap is often made larger, to accommodate next rows - keep reading.)

Period 3, i.e. the 3rd row of the table, starts exactly like period 2, with sodium (Na, with Z = 11),
also a chemically active alkali metal whose atom features 10 electrons filling shells with n = 1 and n = 2
(in Fig. 22 collectively called the neon shell, [Ne]), plus one electron in a 3s state (n = 3, l = 0, m = 0),
which may be reasonably well described by the hydrogen atom theory - see, e.g., the red trace on the
last panel of Fig. 20. Naively we could expect that, according to Eq. (194), and with the account of
double spin degeneracy, this period of the table should have $2n^2 = 2 \times 3^2 = 18$ elements, with gradual
filling of two 3s states, six 3p states, and ten 3d states. However, here we run into a big surprise: after
argon (Ar, with Z = 18), a relatively inert element with ionization energy of 15.7 eV due to the fully
filled 3s and 3p states, the next element, potassium (K, with Z = 19) is an alkali metal again!

The reason for that is the difference of the actual electron energies from those of the hydrogen 
atom, which is due mostly to inter-electron interactions and gradually accumulates with the growth of Z. 
It may be semi-quantitatively understood from the results of Sec. 6. In hydrogen-like atoms, electron
state energies do not depend on the quantum number l (as well as m) - see Eq. (191). However, the


76 Chemical valence is a relatively vague term describing the number of atom’s electrons involved in chemical 
reactions. For the same atom, the number may depend on the chemical compound formed. 
orbital quantum number does affect the wavefunction of an electron. As Fig. 20 shows, the larger l, the
less the probability for an electron to be close to the nucleus, where its positive charge is less
compensated by other electrons. As a result of this effect (and also the relativistic corrections to be
discussed in Sec. 6.3), electron’s energy grows with l. Actually, this effect was visible even in period 2:
it manifests itself in the filling order (p states after s states). However, for potassium (K, with Z = 19)
and calcium (Ca, with Z = 20), energies of 3d states become so high that energies of two 4s states (with
opposite spins) are lower, and they are filled first. As described by factor 3 in the square brackets of Eq.
(200), and also by Eq. (201), the effect of the principal number n on the distance from the nucleus is
stronger than that of l < n, so that 4s wavefunctions of K and Ca are relatively far from the nucleus, and
determine the chemical valence (equal to 1 and 2, correspondingly) of these elements. The next atoms,
from Sc (Z = 21) to Zn (Z = 30), with the gradually filled “internal” 3d states, are the so-called
transition metals whose (comparable) ionization energies and chemical properties are determined by 4s
electrons.
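The filling order described above may be illustrated with a small sketch. Note the assumption: the ordering used here is the empirical Madelung (n + l) rule, which is not derived (or even named) in this text, and which merely summarizes the outcome of the screening effects discussed above. It does not reproduce the known exceptions, such as Cr and Cu in Fig. 22.

```python
LETTERS = "spdfghi"  # spectroscopic letters for l = 0 ... 6


def filling_order(max_n=7):
    """Subshells (n, l) sorted by the empirical Madelung rule (an assumption):
    lower n + l first, and for equal n + l, lower n first."""
    subshells = [(n, l) for n in range(1, max_n + 1) for l in range(n)]
    return sorted(subshells, key=lambda nl: (nl[0] + nl[1], nl[0]))


def configuration(z):
    """Idealized ground-state electron configuration of a neutral atom with Z = z."""
    remaining, parts = z, []
    for n, l in filling_order():
        if remaining == 0:
            break
        capacity = 2 * (2 * l + 1)  # 2 spin states times (2l + 1) values of m
        electrons = min(capacity, remaining)
        parts.append(f"{n}{LETTERS[l]}{electrons}")
        remaining -= electrons
    return " ".join(parts)


if __name__ == "__main__":
    # Naive capacity of the n = 3 shell: 2 n^2 = 18 states.
    print(sum(2 * (2 * l + 1) for l in range(3)))
    for z, symbol in ((18, "Ar"), (19, "K"), (20, "Ca"), (21, "Sc")):
        print(z, symbol, configuration(z))
```

In this idealized order, the 19th and 20th electrons go into the 4s states, and only then do the 3d states start to fill, which is the pattern of K, Ca, and Sc described in the text.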

This fact is the origin of the difference between various forms of the “periodic” table. In its most 
popular option, shown in Fig. 21, K is used to start the next, period 4, and then a new period is started 
each time and only when the first electron with the next principal quantum number (n) appears. 77 This 
topology provides a very clear mapping on the chemical properties of the first element of each period 
(an alkali metal), as well as its last element (a noble gas). This also automatically means making gaps in 
all previous rows. Usually, this gap is made between the atoms with completely filled s states and with
the first electron in a p state, because here the properties of the elements make a somewhat larger step.
(For example, the step from Be to B makes the material an insulator, but it is not large enough to make a
similar difference between Mg and Al.) As a result, elements of the same column have approximately
similar chemical valence and physical properties.

However, to accommodate longer lowest rows, such presentation is inconvenient, because the
whole table would be too broad. This is why the so-called rare earths, including lanthanides (with Z
from 57 to 70, of the 6th row, with gradual filling of 4f and 5d states) and actinides (Z from 89 to 103, of
the 7th row, with gradual filling of 5f and 6d states), are presented as outlet lines (Fig. 21). This is quite
acceptable for the purposes of standard chemistry, because chemical properties of elements within each
group are rather close.

To summarize, the “periodic table of elements” is not periodic in the strict sense of the word. 
Nevertheless, it has had an enormous historic significance for chemistry, as well as atomic and solid 
state physics, and is still very convenient for many purposes. For our course, the most important aspect 
of its discussion is the surprising possibility to describe, at least for classification purposes, such a 
complex multi-electron system as an atom as a set of quasi-independent electrons in certain quantum 
states indexed with the same quantum numbers n, l, and m as those of the hydrogen atom. This fact 
enables the use of various perturbation theories, which give more quantitative description of atomic 
properties. Some of these techniques will be reviewed in Chapters 6 and 8 of this course. 78 


77 Another option is to return to the first column as soon as an atom has one electron in the s state (as in Cu, Ag, 
and Au, in addition to the alkali metals). 

78 For a bit more detailed (but still very succinct) discussion of valence and other chemical aspects of atomic 
structure, I can recommend Chapter 5 of the classical text by L. Pauling, General Chemistry, Dover, 1988. 
3.8. Exercise problems 


3.1. A particle of energy E is incident (in Fig. on the right, within the 
plane of drawing) on a sharp potential step: 

$$U(\mathbf{r}) = \begin{cases} 0, & \text{for } x < 0, \\ U_0, & \text{for } 0 < x. \end{cases}$$

Find the particle reflection probability R as a function of the incidence angle $\theta$; 
sketch and discuss the function, for different magnitudes and signs of $U_0$. 



3.2. Use the finite difference method with step h = a / 2 to calculate as many eigenenergies as 
possible, for a free particle confined to the interior of: 

(i) a square with side a; 

(ii) a cube with side a. 

For the square, repeat the calculations, using a finer step: h = a/3. Compare the results for different h 
with the exact formula. 

Hint: It is advisable to first solve (or review the solution of :-) the similar 1D problem in Chapter 
1, or start from reading about the finite difference method. 79 Also, try to exploit the problem’s symmetry. 


3.3. Use the variational method to estimate the ground state energy of a particle of mass m, 
moving in a spherically-symmetric potential 

$$U(\mathbf{r}) = a r^4.$$


3.4. In the classical version of the Landau level problem discussed in Sec. 2, the center of the 
particle’s orbit is an integral of motion, determined by initial conditions. Calculate the commutation 
relations between the quantum-mechanical operators corresponding to the Cartesian coordinates of the 
center, and to the sum of their squares. 

3.5.* Analyze how the Landau levels (3.50) are modified by an additional constant electric field 
E, directed along the particle plane. Contemplate the physical meaning of your result, and its 
implications for the quantum Hall effect in a gate-defined Hall bar. (The L×W area of such a bar 
[see Fig. 3.6 of the lecture notes] is defined by metallic gate electrodes parallel to the 2D electron 
gas plane - see Fig. on the right. The negative voltage $V_g$, applied to the gates, chases the electron 
gas out of the confinement plane at the remaining sample area.) 

3.6. Analyze how the Landau levels (50) are modified if a 2D particle is confined in an 
additional 1D potential well $U(x) = m\omega_0^2 x^2/2$. 

79 See, e.g., CM Sec. 8.5 or EM Sec. 2.8. 
3.7. Find the eigenfunctions of a spinless, charged 3D particle moving in “crossed” 
(perpendicular), uniform electric and magnetic fields. For each eigenfunction, calculate the expectation 
value of particle’s velocity in the direction perpendicular to both fields, and compare the result with the 
solution of the corresponding classical problem. 

Hint : Generalize Landau’s solution for 2D particles, discussed in Sec. 2. 


3.8. Use the Born approximation to calculate the angular dependence and the full cross-section 
of scattering of an incident plane wave, propagating along axis x, by the following pair of point 
inhomogeneities: 

$$U(\mathbf{r}) = \mathcal{W}\left[\delta\left(\mathbf{r} - \mathbf{n}_z\frac{a}{2}\right) + \delta\left(\mathbf{r} + \mathbf{n}_z\frac{a}{2}\right)\right].$$

Analyze the results in detail. Derive the condition of the Born approximation’s validity for such 
delta-functional scatterers. 


3.9. Use the Born approximation to calculate the differential and full cross-sections of a spherical 
scatterer: 

$$U(\mathbf{r}) = \begin{cases} U_0, & \text{for } r < R, \\ 0, & \text{otherwise}. \end{cases}$$

Analyze both results, especially the angular dependence of $d\sigma/d\Omega$, in detail, for $kR \ll 1$ and $kR \gg 1$. 


3.10. Use the Born approximation to calculate differential and full cross-sections of electron 
scattering by a screened Coulomb field of a point charge Ze, with electrostatic potential 

$$\phi(\mathbf{r}) = \frac{Ze}{4\pi\varepsilon_0 r} e^{-\lambda r},$$

neglecting the spin interaction effects, and analyze their dependence on the screening parameter $\lambda$. 
Compare the results with those given by the classical (“Rutherford”) formula 80 for the unscreened 
Coulomb potential ($\lambda \to 0$), and formulate the condition of Born approximation’s validity in this limit. 


3.11. A quantum particle of mass m with electric charge Q is scattered by a localized distributed 
charge with a spherically-symmetric density p(r) and zero total charge. Use the Born approximation to 
calculate the differential cross-section of forward scattering (with scattering angle $\theta = 0$), and evaluate it 
for scattering of electrons by a hydrogen atom in its ground state. 

3.12. Reformulate the Born approximation for the 1D case. Use the result to find the scattering 
and transfer matrices of a “rectangular” scatterer 

$$U(x) = \begin{cases} U_0, & \text{for } |x| < d/2, \\ 0, & \text{otherwise}. \end{cases}$$


80 See, e.g., CM Sec. 3.7, in particular Eq. (3.72). 
Compare the results with those of the exact calculations carried out earlier in the course. 

3.13. Use Eq. (88) to show that the Bragg rule for the diffraction wave maxima, $\mathbf{k} = \mathbf{k}_0 + \mathbf{Q}$, 
where $\mathbf{Q}$ is any vector of the reciprocal lattice defined by Eq. (110), is valid not only for 
electromagnetic waves, but also for nonrelativistic quantum particle scattering by a periodic (Bravais) 
lattice. 


3.14 . In the tight-binding approximation, calculate the eigenstates and eigenvalues of three 
similar, weakly coupled quantum wells located in the vertices of an equilateral triangle. 

3.15. Figure on the right shows a fragment of a periodic 2D lattice, with 
open and solid points showing the location of different local potentials - say, 
different atoms. 

(i) Find the reciprocal lattice and the 1st Brillouin zone; 

(ii) Find wave number k of the monochromatic radiation incident along 
axis x, at which the lattice creates the first-order diffraction peak within the [x, y] 
plane, and the direction towards this peak. 

(iii) Semi-qualitatively, describe the evolution of the intensity of the peak if the local potentials 
represented by the open and solid points tend to each other. 

3.16. For the 2D hexagonal lattice (Fig. 11b): 

(i) find the reciprocal lattice Q and the 1st Brillouin zone; 

(ii) use the tight-binding approximation to calculate the dispersion relation E(q) for a 2D particle 
moving in a potential with such periodicity, close to the eigenenergy of an axially-symmetric state 
quasi-localized at the potential minima; 

(iii) analyze and sketch (or plot) the resulting dispersion relation E(q) inside the 1st Brillouin 
zone. 



3.17.* Complete the tight-binding approximation calculation of band structure of the honeycomb 
lattice, started in the end of Sec. 4. Analyze the results. Prove that the Dirac points $\mathbf{q}_0$ are located in the 
corners of the 1st Brillouin zone, and express the velocity $v_n$, participating in Eq. (122), in terms of the 
coupling energy $\delta_n$. Show that the final results do not change if the quasi-localized wavefunctions are 
not axially-symmetric, but are proportional to $\exp\{in\varphi\}$ - as they are, with n = 1, for the $2p_z$ electrons of 
carbon atoms in graphene, which are responsible for its transport properties. 

3.18. Examine basic properties of the so-called Wannier functions defined as 

$$\phi_{\mathbf{R}}(\mathbf{r}) \equiv \text{const} \times \int_{\text{BZ}} \psi_{\mathbf{q}}(\mathbf{r})\, e^{-i\mathbf{q}\cdot\mathbf{R}}\, d^3q,$$

where $\psi_{\mathbf{q}}(\mathbf{r})$ is the Bloch wavefunction (3.108), R is any vector of the Bravais lattice, and the integration 
over quasi-momentum q is extended over any (e.g., the first) Brillouin zone. 

3.19 . Evaluate the long-range electrostatic interaction (the so-called London dispersion force) 
between two similar, electrically-neutral but polarizable molecules, modeling them as isotropic 3D 
harmonic oscillators. 
Hint: Using the classical expression for the interaction between two electric dipoles, 81 try to 
present the total Hamiltonian of the system as a sum of Hamiltonians of several independent harmonic 
oscillators, and calculate their ground-state energy as a function of distance between the molecules. 

3.20. Use the variable separation method to find expressions for the eigenfunctions and the 
corresponding eigenenergies of a free 2D particle confined inside a thin round disk of radius R: 

$$U = \begin{cases} 0, & \text{for } 0 \le \rho < R, \\ +\infty, & \text{for } R < \rho, \end{cases}$$

where $\boldsymbol{\rho} = \{x, y, 0\}$. What is the level degeneracy? Calculate 5 lowest energy levels with accuracy better 
than 1%. 

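3.21. Calculate the ground-state energy of a 2D particle localized in a shallow flat-bottom 
potential well 

$$U(\rho) = \begin{cases} -U_0, & \text{for } \rho < R, \\ 0, & \text{for } R < \rho, \end{cases} \qquad \text{with } 0 < U_0 \ll \frac{\hbar^2}{mR^2}.$$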


3.22. Spell out the explicit form of spherical harmonics $Y_4^0(\theta, \varphi)$ and $Y_4^4(\theta, \varphi)$. 


3.23. Calculate $\langle x \rangle$ and $\langle x^2 \rangle$ in the ground state of the planar and spherical rotators of radius R. 
What can you say about averages $\langle p_x \rangle$ and $\langle p_x^2 \rangle$? 


3.24. According to the discussion in the beginning of Sec. 5, eigenfunctions of a 3D harmonic 
oscillator may be calculated as products of three 1D “Cartesian oscillators” - see, in particular Eq. (124), 
with d = 3. However, according to the discussion in Sec. 3.6, wavefunctions of the type (190), 
proportional to spherical harmonics $Y_l^m$, are also eigenstates of this spherically-symmetric system. 
Represent: 

(i) the ground state of the oscillator, and 

(ii) each of its lowest excited states, 

taken in the form (190), as linear combinations of products of 1D oscillator wavefunctions. Also, 
calculate the degeneracy of the nth energy level of the oscillator. 


3.25. A spherical rotator (with $r \equiv (x^2 + y^2 + z^2)^{1/2} = R = \text{const}$) of mass m is in the state with 
wavefunction 

$$\psi = \text{const} \times \left(\frac{1}{3} + \sin^2\theta\right).$$

Calculate the system’s energy. 


3.26. Calculate the eigenfunctions and the energy spectrum of a 3D particle free to move inside 
a sphere of radius R: 

$$U = \begin{cases} 0, & \text{for } 0 \le r < R, \\ +\infty, & \text{for } R < r. \end{cases}$$

Calculate 5 lowest energy levels with a 1% accuracy, and indicate the degeneracy of each level. 

Hint: The solution of this problem requires the so-called spherical Bessel functions $j_l(\xi)$, whose 
description is available in most math handbooks. 82 

81 See, e.g., EM Sec. 3.1. 

3.27. Find the smallest value of depth $U_0$ for which the spherical quantum well 

$$U = \begin{cases} -U_0, & \text{for } r < R, \\ 0, & \text{for } R < r, \end{cases}$$

has a bound (localized) eigenstate. Does such a state exist for a very narrow and deep well $U = -\mathcal{W}\delta(\mathbf{r})$, 
with a positive and finite $\mathcal{W}$? 

3.28. Calculate the smallest value of depth $U_0$ for which the following spherically-symmetric 
quantum well, 

$$U(r) = -U_0 e^{-r/R}, \quad \text{with } U_0, R > 0,$$

has a bound (localized) eigenstate. 

Hint: Try to introduce the following new variables: $f \equiv r\mathcal{R}$ and $\xi \equiv Ce^{-r/2R}$, with an appropriate 
choice of constant C. 

3.29. Calculate the lifetime of the lowest metastable state in the spherical-shell potential 

$$U(r) = \mathcal{W}\delta(r - R), \quad \text{with } \mathcal{W} > 0,$$

in the limit of large $\mathcal{W}$. Specify the limit of validity of your result. 

3.30. Calculate the condition at which a particle of mass m, moving in the field of a very thin 
spherically-symmetric shell, with 

$$U(r) = -\mathcal{W}\delta(r - R), \quad \text{with } \mathcal{W} > 0,$$

has at least one localized (“bound”) stationary state. Compare the result with that for potential 

$$U(\mathbf{r}) = -\mathcal{W}'\delta(\mathbf{r}), \quad \text{with } \mathcal{W}' > 0.$$

Hint: Note that the first delta-function is one-dimensional, while the second one is 
three-dimensional, so that parameters $\mathcal{W}$ and $\mathcal{W}'$ have different dimensionalities. 

3.31. A particle, moving in a central potential U(r), with $U(r) \to 0$ at $r \to \infty$, has a stationary 
state with the following wavefunction: 

$$\psi = Cr^a e^{-\beta r}\cos\theta,$$

where C, a, and $\beta$ are constants. Calculate: 


82 See, e.g., any of the handbooks recommended in MA Sec. 16(ii). 
(i) probabilities of all possible values of quantum numbers m and l, 

(ii) the confining potential, and 

(iii) state’s energy. 

3.32 . Calculate the energy spectrum of a particle moving in a monotonic, but otherwise arbitrary 
attractive central potential U(r), in the approximation of large orbital quantum numbers l. Formulate the 
quantitative condition(s) of validity of your theory. Check that for the Coulomb potential U(r ) = -C/r, 
your result agrees with Eq. (191). 

3.33. An electron had been in the ground state of a hydrogen-like atom/ion with nuclear charge 
Ze, when the charge suddenly changed to (Z + 1)e. 83 Calculate the probabilities for the electron of the 
changed system to be: 

(i) in the ground state, and 

(ii) in the lowest excited state. 

Evaluate these probabilities for the particular case of the beta decay of tritium, with the formation of a 
singly-positive ion of $^3$He. 

3.34. Calculate $\langle x^2 \rangle$ and $\langle p_x^2 \rangle$ in the ground state of a hydrogen-like atom. Compare the results 
with Heisenberg’s uncertainty relation. What do these results tell about the electron’s velocity in the atom? 

3.35 . Apply to Eq. (181) the Hellmann-Feynman theorem (see Problem 1.4) to prove: 

(i) the first of Eqs. (3.201), and 

(ii) the fact that for a spinless particle in an arbitrary spherically-symmetric attractive potential 
U(r), the ground state is always an s-state (with the orbital quantum number l = 0). 

3.36. For the ground state of a hydrogen atom, calculate the expectation values of $\mathcal{E}$ and $\mathcal{E}^2$, 
where $\mathcal{E}$ is the electric field created by the atom at distance $r \gg r_0$ from its nucleus. Interpret the 
resulting relation between $\langle\mathcal{E}\rangle^2$ and $\langle\mathcal{E}^2\rangle$ (at the same observation point). 


83 Such a fast change happens, for example, at the beta-decay, when one of the nucleus’ neutrons suddenly becomes a 
proton, emitting a high-energy electron and a neutrino which leave the system very fast (instantly on the atomic 
time scale), and do not participate in the atom transition’s dynamics. 
Chapter 4. Bra-ket Formalism 

The objective of this chapter is a discussion of Dirac’s bra-ket formalism of quantum mechanics, which 
not only overcomes some inconveniences of wave mechanics, but also allows a natural description of 
such “internal” properties of particles as their spin. In the course of discussion of the formalism I will 
give several simple examples of its use, leaving more involved applications for the following chapters. 


4.1. Motivation 

We have seen that wave mechanics gives many results of primary importance. Moreover, it is 
fully (or mostly) sufficient for many applications, for example, for solid state electronics and device 
physics. However, in the course of our survey we have filed several grievances about this approach. Let 
me briefly summarize these complaints: 

(i) Wave mechanics is focused on the spatial dependence of wavefunctions. On the other hand, 
our attempts to analyze the temporal evolution of quantum systems within this approach (beyond the 
trivial time behavior of the eigenfunctions, described by Eq. (1.61)), run into technical difficulties. For 
example, we could derive Eq. (2.159) describing time dynamics of the metastable state, or Eq. (2.185) 
describing quantum oscillations in coupled wells, only for the simplest potential profiles, though it is 
intuitively clear that these simple results should be common for all problems of this kind. Deriving the 
equations of such processes for arbitrary potential profiles is possible using perturbation theories (to be 
reviewed in Chapter 6), but in the wave mechanics language they would require very bulky 
formulas. 


(ii) The same is true concerning other issues that are conceptually addressable within wave 
mechanics, e.g., the Feynman path integral approach, description of coupling to environment, etc. 
Addressing them in wave mechanics would lead to formulas so bulky that I had (wisely :-) postponed 
them until we have got a more compact formalism on hand. 

(iii) In the discussion of several key problems (for example the harmonic oscillator and 
spherically-symmetric potentials) we have run into rather complicated eigenfunctions coexisting with 
simple energy spectra - which imply some simple background physics. It is very important to get this 
physics revealed. 

(iv) In the wave mechanics postulates, formulated in Sec. 1.2, quantum mechanical operators of 
the coordinate and momentum are treated very unequally - see Eqs. (1.26b). However, some key 
expressions, e.g., for the fundamental eigenfunction of a free particle, 

$$\exp\left\{ i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar} \right\}, \tag{4.1}$$

or the harmonic oscillator’s Hamiltonian, 

$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{m\omega_0^2\hat{r}^2}{2}, \tag{4.2}$$

invite a similar treatment of momentum and coordinate. 


© 2013-2016 K. Likharev 





Essential Graduate Physics 


QM: Quantum Mechanics 


However, the strongest motivation for a more general formalism comes from wave mechanics’ 
conceptual incapability to describe elementary particles’ spins and other internal quantum degrees of 
freedom, such as quark flavors or lepton numbers. In this context, let us review the basic facts on spin 
(which is a very representative and experimentally the most accessible of all internal quantum numbers), 
to understand what a more general formalism should explain - as a minimum. 

Figure 1 shows the conceptual scheme of the simplest spin-revealing experiment, first carried 
out by O. Stern and W. Gerlach in 1922. 1 A collimated beam of electrons is passed through a gap 
between poles of a strong magnet, where the magnetic field $\boldsymbol{\mathcal{B}}$, whose orientation is taken for axis z in 
Fig. 1, is non-uniform, so that both $\mathcal{B}_z$ and $\partial\mathcal{B}_z/\partial z$ are not equal to zero. As a result, the beam splits into 
two parts of equal intensity. 


Fig. 4.1. The simplest Stern-Gerlach experiment. 


This simplest experiment can be semi-quantitatively explained on classical, though somewhat 
phenomenological grounds by assuming that each electron has an intrinsic, permanent magnetic dipole 
moment m. Indeed, classical electrodynamics 2 tells us that the potential energy U of a magnetic dipole 
in an external magnetic field is equal to $(-\mathbf{m}\cdot\boldsymbol{\mathcal{B}})$, so that the force acting on the particle, 

$$\mathbf{F} = -\nabla U = -\nabla(-\mathbf{m}\cdot\boldsymbol{\mathcal{B}}), \tag{4.3}$$

has a nonvanishing vertical component 

$$F_z = -\frac{\partial}{\partial z}(-m_z\mathcal{B}_z) = m_z\frac{\partial\mathcal{B}_z}{\partial z}. \tag{4.4}$$

Hence if we further postulate the existence of two possible, discrete values of $m_z = \pm\mu$, this 
explains the Stern-Gerlach effect qualitatively, as a result of the incident electrons having a random 
sign, but similar magnitude of $m_z$. A quantitative explanation of the beam splitting angle requires the 
magnitude of $\mu$ to be equal (or close) to the so-called Bohr magneton 3 

$$\mu_B \equiv \frac{\hbar e}{2m_e} \approx 0.9274\times 10^{-23}\ \frac{\text{J}}{\text{T}}. \tag{4.5}$$

As we will see below, this value cannot be explained by any internal motion of the electron, say its 
rotation about axis z. 
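As a quick numerical cross-check of Eq. (4.5), the following minimal Python sketch evaluates the Bohr magneton from the SI values of the fundamental constants. The constant values are hard-coded CODATA 2018 numbers, and the names used (HBAR, E_CHARGE, M_E, K_B, bohr_magneton) are chosen here only for illustration; none of them comes from the lecture notes.

```python
# CODATA 2018 values in SI units, hard-coded so that the sketch has no dependencies.
HBAR = 1.054571817e-34      # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron rest mass, kg
K_B = 1.380649e-23          # Boltzmann constant, J/K


def bohr_magneton() -> float:
    """Return mu_B = hbar * e / (2 * m_e), in J/T, as in Eq. (4.5)."""
    return HBAR * E_CHARGE / (2.0 * M_E)


mu_b = bohr_magneton()
print(f"mu_B     = {mu_b:.4e} J/T")
# The ratio mu_B / k_B gives the temperature equivalent of the Zeeman energy per tesla.
print(f"mu_B/k_B = {mu_b / K_B:.3f} K/T")
```

The ratio mu_B/k_B comes out somewhat below 1 K/T, so that the order-of-magnitude mnemonic rule is reproduced: the magnetic energy of such a dipole in a field of a few tesla is comparable with the thermal energy scale at a few kelvin.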


1 To my knowledge, the concept of spin as an internal rotation of a particle was first suggested by R. Kronig, then 
a 20-year-old student, in January 1925, a few months before two other students, G. Uhlenbeck and S. Goudsmit - 
to whom the idea is usually attributed. The concept was then accepted and developed quantitatively by W. Pauli. 

2 See, e.g., EM Sec. 5.4, in particular Eq. (5.100). 

3 A convenient mnemonic rule is that it is close to 1 K/T. In the Gaussian units, $\mu_B \equiv \hbar e/2m_e c \approx 0.9274\times 10^{-20}$. 
Much more importantly, this semi-classical language cannot explain the results of the following 
set of multi-stage Stern-Gerlach experiments, shown in Fig. 2 - even qualitatively. In the first of the 
experiments, the electron beam is first passed through a magnetic field oriented (together with its 
gradient) along axis z, just as in Fig. 1. Then one of the two resulting beams is absorbed (or otherwise 
removed from the setup), while the other one is passed through a similar but x-oriented field. The 
experiment shows that this beam is split again into two components of equal intensity. A classical 
explanation of this experiment would require a very unnatural suggestion that the initial electrons had 
random but discrete components of the magnetic moment simultaneously in two directions, z and x. 

However, even this assumption cannot explain the results of the three-stage Stern-Gerlach 
experiment shown on the middle panel of Fig. 2. Here, the previous two-stage setup is complemented 
with one more absorber and one more magnet, now with the z-orientation again. Completely counter- 
intuitively, it again gives two beams of equal intensity, as if we have not yet filtered out the electrons 
with m z corresponding to the lower beam, in the first, z-stage. 



The only way to save the classical explanation here is to say that maybe, electrons somehow 
interact with the magnetic field, so that the x-polarized (non-absorbed) beam becomes spontaneously 
depolarized again somewhere between magnetic stages. But any hope for such an explanation is ruined by 
the control experiment shown on the bottom panel of Fig. 2, whose results indicate that no such 
depolarization happens. 

We will see below that all these (and many more) results find a natural explanation in the matrix 
mechanics pioneered by W. Heisenberg, M. Born and P. Jordan in 1925. However, the matrix formalism 
is inconvenient for the solution of most problems discussed in Chapters 1-3, and for a time it was 
eclipsed by Schrodinger’s wave mechanics, which had been put forward just a few months later. 
However, very soon P. A. M. Dirac introduced a more general bra-ket formalism, which provides a 
generalization of both approaches and proves their equivalence. Let me describe it. 
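Jumping ahead, the 50/50 splittings observed in the two- and three-stage experiments of Fig. 2 may be reproduced by a few lines of code. This is only a sketch under assumptions that are not yet derived at this point of the text: that the spin state of an electron may be represented by a column of two complex numbers, that the probability of finding a state in a given output beam is the modulus squared of the corresponding inner product, and that the z- and x-polarized states have the particular components written below. All function and variable names are hypothetical.

```python
import math


def inner(bra_of, ket):
    """Inner product <a|b>, for kets stored as lists of complex components."""
    return sum(a.conjugate() * b for a, b in zip(bra_of, ket))


def probability(outcome, state):
    """Probability |<outcome|state>|^2 of finding `state` in the `outcome` beam."""
    return abs(inner(outcome, state)) ** 2


s = 1.0 / math.sqrt(2.0)
# Assumed two-component representations of the beams polarized along z and x.
Z_UP, Z_DOWN = [1 + 0j, 0j], [0j, 1 + 0j]
X_UP, X_DOWN = [s + 0j, s + 0j], [s + 0j, -s + 0j]

# Stage 1 (z-magnet plus absorber) leaves only Z_UP; stage 2 is the x-oriented magnet.
p_x = (probability(X_UP, Z_UP), probability(X_DOWN, Z_UP))
# The second absorber leaves only X_UP; stage 3 is a z-oriented magnet again.
p_z = (probability(Z_UP, X_UP), probability(Z_DOWN, X_UP))
print(p_x, p_z)
```

Within these assumptions, both pairs of probabilities equal 1/2 (up to rounding), i.e. the beam is split into two components of equal intensity at each of these stages, in contrast to the classical expectation for the last z-stage.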



4.2. States, state vectors, and linear operators 

The basic notion of the general formulation of quantum mechanics is the quantum state of a 
system. 4 To get some gut feeling of this notion, if a quantum state $\alpha$ of a particle may be adequately 
described by wave mechanics, this description is given by the corresponding wavefunction $\Psi_\alpha(\mathbf{r}, t)$. 
Note, however, that the state as such is not a mathematical object (such as a function), 5 and can participate in 
mathematical formulas only as a “pointer” - e.g., the index of function $\Psi_\alpha$. On the other hand, the 
wavefunction is not a state, but a mathematical object (a complex function of space and time) giving a 
quantitative description of the state - just as the radius-vector as a function of time is a mathematical 
object describing the motion of a classical particle - see Fig. 3. Similarly, in the Dirac formalism a 
certain quantum state $\alpha$ is described by either of two mathematical objects, called the state vectors: the 
ket-vector $|\alpha\rangle$ and bra-vector $\langle\alpha|$. 6 

One should be cautious with the term “vector” here. Usual “geometric” vectors are defined in the 
usual geometric (say, Euclidean) space. In contrast, bra- and ket-vectors are defined in abstract Hilbert 
spaces of a given system, 7 and, despite certain similarities with the geometric vectors, are new 
mathematical objects, so that we need new rules for handling them. The primary rules are essentially 
postulates and are justified only by the correct description/prediction of all experimental observations by their 
corollaries. While there is a general consensus among physicists what the corollaries are, there are many 
possible ways to carve from them the basic postulate sets. Just as in Sec. 1.2, I will not try too hard to 
beat the number of the postulates to the smallest possible minimum, trying instead to keep their physical 
meaning transparent. 



classical mechanics: $\mathbf{r}(t)$ 
wave mechanics: either $\Psi_\alpha(\mathbf{r}, t)$ or $\Psi_\alpha^*(\mathbf{r}, t)$ 
bra-ket formalism: either $|\alpha\rangle$ or $\langle\alpha|$ 

Fig. 4.3. Particle’s state and its descriptions. 


(i) Ket-vectors. Let us start with ket-vectors - sometimes called just kets for short. Perhaps the 
most important property of the vectors concerns their linear superposition. Namely, if several 
ket-vectors $|\alpha_j\rangle$ describe possible states of a quantum system, then any linear combination (superposition) 

$$|\alpha\rangle = \sum_j c_j|\alpha_j\rangle, \tag{4.6}$$


4 An attentive reader could notice my smuggling term “system” instead of “particle” which was used in the 
previous chapters. Indeed, the bra-ket formalism allows the description of quantum systems much more complex 
than a single spinless particle that is a typical (though not the only possible) subject of wave mechanics. 

5 As was expressed nicely by A. Peres, one of pioneers of the quantum information theory, “quantum phenomena 
do not occur in the Hilbert space, they occur in a laboratory”. 

6 Terms bra and ket were suggested to reflect the fact that pair $\langle\beta|$ and $|\alpha\rangle$ may be considered as the set of parts 
of combination $\langle\beta|\alpha\rangle$ (see Eq. (11) below), which resembles an expression in the usual angle brackets. 

7 The Hilbert space of a given system is defined as the set of all its possible state vectors. As should be clear from 
this definition, it is not advisable to speak about a “Hilbert space of quantum states”. 
where $c_j$ are any (possibly complex) c-numbers, also describes a possible state of the same system. (One 
may say that vector $|\alpha\rangle$ belongs to the same Hilbert space as all $|\alpha_j\rangle$.) Actually, since ket-vectors are new 
mathematical objects, the exact meaning of the right-hand part of Eq. (6) becomes clear only after we 
have postulated the following rules of summation of these vectors, 

$$|\alpha_j\rangle + |\alpha_{j'}\rangle = |\alpha_{j'}\rangle + |\alpha_j\rangle, \tag{4.7}$$

and their multiplication by c-numbers: 

$$c_j|\alpha_j\rangle = |\alpha_j\rangle c_j. \tag{4.8}$$


Note that in the set of wave mechanics postulates, statements parallel to (7) and (8) were unnecessary, 
because wavefunctions are the usual (albeit complex) functions of space and time, and we know from 
the usual algebra that such relations are valid. 

As evident from Eq. (6), the complex coefficient $c_j$ may be interpreted as the “weight” of state $\alpha_j$ 
in the linear superposition $\alpha$. One important particular case is $c_j = 0$, showing that state $\alpha_j$ does not 
participate in the superposition $\alpha$. By the way, the corresponding term of sum (6), i.e. product 

$$0\,|\alpha_j\rangle, \tag{4.9}$$


has a special name: the null-state vector. (It is important to avoid confusion between the null-state 
corresponding to vector (9), and the ground state of the system, which is frequently denoted by ket- 
vector |0). In some sense, the null-state does not exist at all, while the ground state does - and frequently 
is the most important quantum state of the system.) 


(ii) Bra-vectors and inner (“scalar”) products. Bra-vectors $\langle\alpha|$, which obey the rules similar to 
Eqs. (7) and (8), are not new, independent objects: if a ket-vector $|\alpha\rangle$ is known, the corresponding 
bra-vector $\langle\alpha|$ describes the same state. In other words, there is a unique dual correspondence between $|\alpha\rangle$ 
and $\langle\alpha|$, 8 very similar (though not identical) to that between a wavefunction $\Psi$ and its complex conjugate 
$\Psi^*$. The correspondence between these vectors is described by the following rule: if a ket-vector of a 
linear superposition is described by Eq. (6), then the corresponding bra-vector is 

$$\langle\alpha| = \sum_j c_j^*\langle\alpha_j|. \tag{4.10}$$

The mathematical convenience of using two types of vectors, rather than just one, becomes clear 
from the notion of their inner product (also called the short bracket): 

$$\langle\beta|\alpha\rangle \equiv \langle\beta|\cdot|\alpha\rangle. \tag{4.11}$$


This is a (generally, complex) 9 scalar, whose main property is the linearity with respect to any of its 
component vectors. For example, if a linear superposition a is described by the ket-vector (6), then 


8 Mathematicians like to say that the ket- and bra-vectors of the same quantum system are defined in two 
isomorphic Hilbert spaces. 

9 This is one of the differences of bra- and ket-vectors from the usual (geometrical) vectors whose scalar product 
is always a real scalar. 




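$$\langle\beta|\alpha\rangle = \sum_j c_j\langle\beta|\alpha_j\rangle, \tag{4.12}$$

while if Eq. (10) is true, then 

$$\langle\alpha|\beta\rangle = \sum_j c_j^*\langle\alpha_j|\beta\rangle. \tag{4.13}$$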

In plain English, c-numbers may be moved either into, or out of the inner products. 


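The second key property of the inner product is 

$$\langle\alpha|\beta\rangle = \langle\beta|\alpha\rangle^*. \tag{4.14}$$

It is compatible with Eq. (10); indeed, the complex conjugation of both parts of Eq. (12) gives: 

$$\langle\beta|\alpha\rangle^* = \sum_j c_j^*\langle\beta|\alpha_j\rangle^* = \sum_j c_j^*\langle\alpha_j|\beta\rangle = \langle\alpha|\beta\rangle. \tag{4.15}$$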

Finally, one more rule: the inner product of the bra- and ket-vectors describing the same state 
(called the norm squared) is real and non-negative, 

$$\langle\alpha|\alpha\rangle \ge 0. \tag{4.16}$$

In order to give the reader some feeling about the meaning of this rule: we will show below that if state 
$\alpha$ may be described by wavefunction $\Psi_\alpha(\mathbf{r}, t)$, then 

$$\langle\alpha|\alpha\rangle = \int \Psi_\alpha^*\Psi_\alpha\, d^3r \ge 0. \tag{4.17}$$


Hence the role of the bra-ket is very similar to the complex conjugation of the wavefunction, and Eq. 
(10) emphasizes this similarity. (Note that, by convention, there is no conjugation sign in the bra-part of 
the inner product; its role is played by the angular bracket inversion.) 
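To get a feeling of Eq. (4.17), one may evaluate the integral numerically for a particular wavefunction. The sketch below is a 1D analog (an assumption made for brevity; Eq. (4.17) is written for 3D) using a normalized Gaussian wave packet with hypothetical parameters a and k, and a simple midpoint-rule integration; the function names are illustrative.

```python
import cmath
import math


def gaussian_packet(x: float, a: float = 1.0, k: float = 2.0) -> complex:
    """Normalized 1D Gaussian wave packet of width a, with carrier wave number k."""
    envelope = (math.pi * a * a) ** -0.25 * math.exp(-x * x / (2.0 * a * a))
    return envelope * cmath.exp(1j * k * x)


def norm_squared(psi, x_min: float, x_max: float, n: int = 4000) -> complex:
    """Midpoint-rule estimate of the integral of psi^* psi over [x_min, x_max]."""
    dx = (x_max - x_min) / n
    total = 0j
    for i in range(n):
        value = psi(x_min + (i + 0.5) * dx)
        total += value.conjugate() * value * dx
    return total


result = norm_squared(gaussian_packet, -10.0, 10.0)
print(result)
```

Though the wavefunction itself is complex, each contribution to the sum is a product of a number by its complex conjugate, so that the result is real and non-negative; for this normalized packet it is close to 1.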

(iii) Operators. One more key notion of the Dirac formalism is that of quantum-mechanical linear 
operators. Just as for the operators discussed in wave mechanics, the function of an operator is the 
“generation” of one state from another: if $|\alpha\rangle$ is a possible ket of the system, and $\hat{A}$ is a legitimate 
operator, then the following combination, 

$$\hat{A}|\alpha\rangle, \tag{4.18}$$

is also a ket-vector describing a possible state of the system, i.e. a ket-vector in the same Hilbert space 
as the initial vector $|\alpha\rangle$. As follows from the adjective “linear”, the main rule governing the operators is 
their linearity with respect to both any superposition of vectors: 

$$\hat{A}\left(\sum_j c_j|\alpha_j\rangle\right) = \sum_j c_j\hat{A}|\alpha_j\rangle, \tag{4.19}$$

and any superposition of operators: 

$$\left(\sum_j c_j\hat{A}_j\right)|\alpha\rangle = \sum_j c_j\hat{A}_j|\alpha\rangle. \tag{4.20}$$
These rules are evidently similar to Eqs. (1.53)-(1.54) of wave mechanics. 
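The two linearity rules (4.19) and (4.20) may be illustrated in a toy model in which operators are represented by 2×2 complex matrices and ket-vectors by two-component columns. This matrix representation, the sample numbers, and the helper names below (apply, combine_kets, combine_ops, close) are assumptions made only for this sketch.

```python
def apply(op, ket):
    """Act with an operator (square matrix, list of rows) on a ket (list of components)."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]


def combine_kets(coeffs, kets):
    """Superposition of kets: sum_j c_j |a_j>."""
    return [sum(c * k[i] for c, k in zip(coeffs, kets)) for i in range(len(kets[0]))]


def combine_ops(coeffs, ops):
    """Superposition of operators: sum_j c_j A_j."""
    n = len(ops[0])
    return [[sum(c * op[i][j] for c, op in zip(coeffs, ops)) for j in range(n)]
            for i in range(n)]


def close(u, v, tol=1e-12):
    """Component-wise comparison of two kets."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


# Arbitrary sample operators, kets and coefficients (hypothetical numbers).
A1 = [[1 + 0j, 2j], [-1j, 0.5 + 0j]]
A2 = [[0j, 1 + 1j], [3 + 0j, -2j]]
kets = [[1 + 1j, 2 + 0j], [0.5j, -1 + 0j]]
c = [0.2 - 1j, 1.5 + 0.5j]

# Eq. (4.19): A (sum_j c_j |a_j>) = sum_j c_j A |a_j>
lhs_19 = apply(A1, combine_kets(c, kets))
rhs_19 = combine_kets(c, [apply(A1, k) for k in kets])
# Eq. (4.20): (sum_j c_j A_j) |a> = sum_j c_j A_j |a>
lhs_20 = apply(combine_ops(c, [A1, A2]), kets[0])
rhs_20 = combine_kets(c, [apply(A1, kets[0]), apply(A2, kets[0])])

print(close(lhs_19, rhs_19), close(lhs_20, rhs_20))
```

In such a representation, both rules reduce to the usual distributive properties of the matrix product, which is why both comparisons hold.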


The above rules imply that an operator “acts” on the ket-vector on its right; however, a 
combination of the type $\langle\alpha|\hat{A}$ is also legitimate and presents a new bra-vector. It is important that, 
generally, this vector does not represent the same state as ket-vector (18); instead, the bra-vector 
isomorphic to ket-vector (18) is 

$$\langle\alpha|\hat{A}^\dagger. \tag{4.21}$$

This statement serves as the definition of the Hermitian conjugate (or “Hermitian adjoint”) $\hat{A}^\dagger$ of the 
initial operator $\hat{A}$. For an important class of operators, called the Hermitian operators, the conjugation 
is inconsequential, i.e. for them 

$$\hat{A}^\dagger = \hat{A}. \tag{4.22}$$

(This equality, as well as any other operator equation below, means that these operators act similarly on 
any bra- or ket-vector.) 10 


To proceed further, we need an additional postulate, called the associative axiom of 
multiplication: into any legitimate bra-ket expression, 11 not including an explicit summation, we may 
insert or remove parentheses (just as in the ordinary product of scalars), meaning as usual that the 
operation inside the parentheses is performed first. The first two examples of this postulate are given by 
Eqs. (19) and (20), but the associative axiom is more general and says, for example: 

$$\langle\beta|\left(\hat{A}|\alpha\rangle\right) = \left(\langle\beta|\hat{A}\right)|\alpha\rangle \equiv \langle\beta|\hat{A}|\alpha\rangle. \tag{4.23}$$


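This equality serves as the definition of the last form, called the long bracket (evidently, also a scalar), 
with an operator sandwiched between a bra-vector and a ket-vector. This definition, when combined 
with the definition of the Hermitian conjugate and Eq. (14), yields an important corollary: 

$$\langle\beta|\hat{A}|\alpha\rangle = \langle\beta|\left(\hat{A}|\alpha\rangle\right) = \left(\left(\langle\alpha|\hat{A}^\dagger\right)|\beta\rangle\right)^* = \langle\alpha|\hat{A}^\dagger|\beta\rangle^*, \tag{4.24}$$

which is most frequently rewritten as 

$$\langle\alpha|\hat{A}^\dagger|\beta\rangle^* = \langle\beta|\hat{A}|\alpha\rangle. \tag{4.25}$$
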


The associative axiom also enables us to readily explore the following definition of one more, outer 
product of bra- and ket-vectors: 


10 If we consider c-numbers as a particular type of operators, then according to Eqs. (11) and (21), for them the 
Hermitian conjugation is equivalent to the simple complex conjugation, so that only a real c-number may be 
considered as a particular case of the Hermitian operator (22). 

11 Here “legitimate” means “having a clear sense in the bra-ket formalism”. Some examples of “illegitimate” 
expressions: $|\alpha\rangle\hat{A}$, $\hat{A}\langle\alpha|$, $|\alpha\rangle|\beta\rangle$, $\langle\alpha|\langle\beta|$. Note, however, that the last two expressions may be legitimate if $\alpha$ and $\beta$ 
are states of different systems, i.e. if their state vectors belong to different Hilbert spaces. We will run into such 
tensor products of bra- and ket-vectors (sometimes denoted, respectively, as $|\alpha\rangle\otimes|\beta\rangle$ and $\langle\alpha|\otimes\langle\beta|$) in Chapters 6-8. 

$$|\beta\rangle\langle\alpha|. \tag{4.26}$$



In contrast to the inner product (12), which is a scalar, this mathematical construct is an operator. 
Indeed, the associative axiom allows us to remove parentheses in the following expression: 

$$\left(|\beta\rangle\langle\alpha|\right)|\gamma\rangle = |\beta\rangle\langle\alpha|\gamma\rangle. \tag{4.27}$$

But the last bra-ket is just a scalar; hence the mathematical object (26) acting on a ket-vector (in this 
case, $|\gamma\rangle$) gives a new ket-vector, which is the essence of an operator’s action. Very similarly, 

$$\langle\delta|\left(|\beta\rangle\langle\alpha|\right) = \langle\delta|\beta\rangle\langle\alpha| \tag{4.28}$$

- again a typical operator’s action on a bra-vector. 
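
The action described by Eq. (27) may be checked numerically. The following Python sketch anticipates the component (column and row) representation of state vectors that is introduced only in the next section; the helper names `inner`, `outer`, and `apply`, as well as the sample components, are illustrative assumptions rather than a part of the formalism.

```python
def inner(beta, alpha):
    """Inner product <beta|alpha>; the bra components are complex-conjugated."""
    return sum(b.conjugate() * a for b, a in zip(beta, alpha))

def outer(beta, alpha):
    """Matrix of the outer product |beta><alpha| in the chosen basis."""
    return [[b * a.conjugate() for a in alpha] for b in beta]

def apply(op, ket):
    """Action of an operator (a square matrix) on a ket (a column of components)."""
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]

# Illustrative components of three kets in a hypothetical 2-state basis.
alpha = [1 + 0j, 1j]
beta = [2 + 0j, 1 - 1j]
gamma = [0.5 + 0j, -1j]

lhs = apply(outer(beta, alpha), gamma)         # (|beta><alpha|) |gamma>
rhs = [inner(alpha, gamma) * b for b in beta]  # |beta> times the scalar <alpha|gamma>
print(lhs, rhs)
```

Both lists hold the components of the same ket-vector: the outer product has turned $|\gamma\rangle$ into $|\beta\rangle$ multiplied by a c-number.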

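Now let us perform the following calculation. We may use the parentheses insertion into the bra-ket 
equality following from Eq. (14), 

$$\langle\gamma|\alpha\rangle\langle\beta|\delta\rangle = \left(\langle\delta|\beta\rangle\langle\alpha|\gamma\rangle\right)^*, \tag{4.29}$$

to transform it to the following form: 

$$\langle\gamma|\left(|\alpha\rangle\langle\beta|\right)|\delta\rangle = \left(\langle\delta|\left(|\beta\rangle\langle\alpha|\right)|\gamma\rangle\right)^*. \tag{4.30}$$

Since this equation should be valid for any vectors $\langle\gamma|$ and $|\delta\rangle$, its comparison with Eq. (25) gives the 
following operator equality: 

$$\left(|\alpha\rangle\langle\beta|\right)^\dagger = |\beta\rangle\langle\alpha|. \tag{4.31}$$

This is the conjugate rule for outer products; it is reminiscent of rule (14) for inner products, but involves the 
Hermitian (rather than the usual complex) conjugation. 

The associative axiom is also valid for the operator “multiplication”: 

$$\left(\hat{A}\hat{B}\right)|\alpha\rangle = \hat{A}\left(\hat{B}|\alpha\rangle\right), \qquad \langle\beta|\left(\hat{A}\hat{B}\right) = \left(\langle\beta|\hat{A}\right)\hat{B}, \tag{4.32}$$
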


showing that the action of an operator product on a state vector is nothing more than the sequential 
action of the operands. However, we have to be rather careful with the operator products; generally they 
do not commute: $\hat{A}\hat{B} \neq \hat{B}\hat{A}$. This is why the commutator, the operator defined as 

$$\left[\hat{A}, \hat{B}\right] \equiv \hat{A}\hat{B} - \hat{B}\hat{A}, \tag{4.33}$$

is a very useful notion. Another similar notion is the anticommutator:¹² 

$$\left\{\hat{A}, \hat{B}\right\} \equiv \hat{A}\hat{B} + \hat{B}\hat{A}. \tag{4.34}$$

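
As a quick illustration of definitions (33) and (34), here is a minimal Python sketch that treats two operators as plain 2×2 matrices (a representation justified only in the next section). The two sample matrices and the helper names `matmul`, `commutator`, and `anticommutator` are assumptions made for this example.

```python
def matmul(a, b):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a[j][k] * b[k][l] for k in range(len(b)))
             for l in range(len(b[0]))] for j in range(len(a))]

def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[j][k] - ba[j][k] for k in range(len(a))] for j in range(len(a))]

def anticommutator(a, b):
    """{A, B} = AB + BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[j][k] + ba[j][k] for k in range(len(a))] for j in range(len(a))]

# Two sample matrices that do not commute.
A = [[0, 1], [1, 0]]
B = [[1, 0], [0, -1]]

comm = commutator(A, B)
anti = anticommutator(A, B)
print(comm, anti)
```

For this particular pair the commutator is nonzero while the anticommutator vanishes, so that AB = -BA.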


Finally, the bra-ket formalism broadly uses two special operators: the null operator $\hat{0}$ defined 
by the following relations: 


12 Another popular notation for the anticommutator is $[\hat{A}, \hat{B}]_+$; it will not be used in these notes. 




$$\hat{0}|\alpha\rangle \equiv 0|\alpha\rangle, \qquad \langle\alpha|\hat{0} \equiv \langle\alpha|0, \tag{4.35}$$

for an arbitrary state $\alpha$; we may say that the null operator “kills” any state, turning it into the null-state. 
Another elementary operator is the identity operator, which is also defined by its action (or rather 
“inaction” :-) on an arbitrary state vector: 

$$\hat{I}|\alpha\rangle \equiv |\alpha\rangle, \qquad \langle\alpha|\hat{I} \equiv \langle\alpha|. \tag{4.36}$$



4.3. State basis and matrix representation 


While some operations in quantum mechanics may be carried out in the general bra-ket 
formalism outlined above, most calculations are done for specific quantum systems that feature at least 
one full and orthonormal set {u} of states $u_j$, frequently called a basis. These terms mean that any state 
vector of the system may be uniquely represented as a linear superposition of the basis vectors, 

$$|\alpha\rangle = \sum_j \alpha_j|u_j\rangle, \qquad \langle\alpha| = \sum_j \alpha_j^*\langle u_j|, \tag{4.37}$$

and that the basis vectors are mutually orthogonal and normalized: 

$$\langle u_j|u_{j'}\rangle = \delta_{jj'}. \tag{4.38}$$

For the systems that may be described by wave mechanics, examples of the full orthonormal bases are 
represented by any orthonormal set of eigenfunctions calculated in the previous 3 chapters - as the 
simplest example, see Eq. (1.76). 


Due to the uniqueness of expansion (37), the full set of coefficients $\alpha_j$ gives a complete 
description of state $\alpha$ (in a fixed basis {u}), just as the usual Cartesian components $A_x$, $A_y$, and $A_z$ give a 
complete description of a usual geometric 3D vector A (in a fixed reference frame). Still, let me 
emphasize some differences between the quantum-mechanical bra- and ket-vectors and the usual 
geometric vectors: 

(i) a basis set may have a large or even infinite number of states $u_j$, and 

(ii) the expansion coefficients $\alpha_j$ may be complex. 

With these reservations in mind, the analogy with geometric vectors may be pushed even further. 
Let us inner-multiply both parts of the first of Eqs. (37) by a bra-vector $\langle u_{j'}|$ and then transform the 
relation using the linearity rules discussed in the previous section, and Eq. (38): 

$$\langle u_{j'}|\alpha\rangle = \langle u_{j'}|\sum_j \alpha_j|u_j\rangle = \sum_j \alpha_j\langle u_{j'}|u_j\rangle = \alpha_{j'}. \tag{4.39}$$

Together with Eq. (14), this means that any of the expansion coefficients in Eq. (37) may be presented 
as an inner product: 

$$\alpha_j = \langle u_j|\alpha\rangle, \qquad \alpha_j^* = \langle\alpha|u_j\rangle; \tag{4.40}$$

these relations are analogs of equalities $A_j = \mathbf{n}_j \cdot \mathbf{A}$ of the usual vector algebra. Using these important 
relations (which we will use on numerous occasions), expansions (37) may be rewritten as 

$$|\alpha\rangle = \sum_j |u_j\rangle\langle u_j|\alpha\rangle \equiv \sum_j \hat{\Lambda}_j|\alpha\rangle, \qquad \langle\alpha| = \sum_j \langle\alpha|u_j\rangle\langle u_j| \equiv \sum_j \langle\alpha|\hat{\Lambda}_j. \tag{4.41}$$



A comparison of these relations with Eq. (26) shows that the outer product defined as 

$$\hat{\Lambda}_j \equiv |u_j\rangle\langle u_j| \tag{4.42}$$

is a legitimate linear operator. Such an operator, acting on any state vector of the type (37), singles out 
just one of its components, for example, 

$$\hat{\Lambda}_j|\alpha\rangle = |u_j\rangle\langle u_j|\alpha\rangle = \alpha_j|u_j\rangle, \tag{4.43}$$

i.e. kills all components of the linear superposition but one. In the geometric analogy, such an operator 
“projects” the state vector on its (j-th) “direction”, hence its name - the projection operator. Probably, the 
most important property of the projection operators, called the closure (or completeness) relation, 
immediately follows from Eq. (41): their sum over the full basis is equivalent to the identity operator: 

$$\sum_j \hat{\Lambda}_j \equiv \sum_j |u_j\rangle\langle u_j| = \hat{I}. \tag{4.44}$$



This means in particular that we may insert the left-hand part of Eq. (44) into any bra-ket relation, at any 
place - the trick that we will use again and again. 
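
The closure relation is easy to verify numerically for a small basis. In the following Python sketch, the two-state orthonormal basis, the sample state, and the helper names `inner` and `projector` are all assumptions chosen for illustration; the vectors are written as lists of components in some underlying reference basis.

```python
import math

def inner(beta, alpha):
    """Inner product <beta|alpha>; the bra components are complex-conjugated."""
    return sum(b.conjugate() * a for b, a in zip(beta, alpha))

def projector(u):
    """Matrix of the projection operator |u><u|."""
    return [[uj * uk.conjugate() for uk in u] for uj in u]

s = 1 / math.sqrt(2)
# A sample orthonormal basis of a 2D Hilbert space (an assumption of this sketch).
basis = [[s + 0j, 1j * s], [s + 0j, -1j * s]]

# Sum of the projection operators over the full basis, cf. the closure relation.
projectors = [projector(u) for u in basis]
closure = [[sum(p[j][k] for p in projectors) for k in range(2)] for j in range(2)]

# Expansion coefficients as inner products, and the state rebuilt from them.
alpha = [0.6 + 0j, 0.8j]
coefficients = [inner(u, alpha) for u in basis]
rebuilt = [sum(c * u[j] for c, u in zip(coefficients, basis)) for j in range(2)]
print(closure, rebuilt)
```

Within the rounding errors, `closure` is the 2×2 identity matrix, and `rebuilt` reproduces the components of the initial state.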

Let us see how expansions (37) transform all the notions introduced in the last section, starting 
from the short bra-ket (11) (the inner product of two state vectors): 

$$\langle\beta|\alpha\rangle = \sum_{j,j'} \langle u_j|\beta_j^*\alpha_{j'}|u_{j'}\rangle = \sum_{j,j'} \beta_j^*\alpha_{j'}\delta_{jj'} = \sum_j \beta_j^*\alpha_j. \tag{4.45}$$

Besides the complex conjugation, this expression is similar to the scalar product of the usual vectors. 
Now, let us explore the long bra-ket (23): 

$$\langle\beta|\hat{A}|\alpha\rangle = \sum_{j,j'} \beta_j^*\langle u_j|\hat{A}|u_{j'}\rangle\alpha_{j'} \equiv \sum_{j,j'} \beta_j^*A_{jj'}\alpha_{j'}. \tag{4.46}$$

Here, the last step uses a very important notion of matrix elements of the operator, defined as 

$$A_{jj'} \equiv \langle u_j|\hat{A}|u_{j'}\rangle. \tag{4.47}$$

As is evident from Eq. (46), the full set of the matrix elements completely characterizes the operator, just 
as the full set of expansion coefficients (40) fully characterizes a quantum state. The term “matrix” 
means, first of all, that it is convenient to present the full set of $A_{jj'}$ as a square table (matrix), with the 
linear dimension equal to the number of basis states $u_j$ of the system under consideration, i.e. the size 
of its Hilbert space. 


As two simplest examples, all matrix elements of the null-operator, defined by Eqs. (35), are 
evidently equal to zero (in any basis), and hence it may be presented as a matrix of zeros (the null 
matrix): 

$$\mathrm{0} = \begin{pmatrix} 0 & 0 & \cdots \\ 0 & 0 & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix}, \tag{4.48}$$

while for the identity operator $\hat{I}$, defined by Eqs. (36), we readily get 

$$I_{jj'} = \langle u_j|\hat{I}|u_{j'}\rangle = \langle u_j|u_{j'}\rangle = \delta_{jj'}, \tag{4.49}$$

i.e. its matrix (called the identity matrix) is diagonal - also in any basis: 

$$\mathrm{I} = \begin{pmatrix} 1 & 0 & \cdots \\ 0 & 1 & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix}. \tag{4.50}$$



The convenience of the matrix language extends well beyond the presentation of particular 
operators. For example, let us use definition (47) to calculate matrix elements for a product of two 
operators: 

$$(AB)_{jj''} = \langle u_j|\hat{A}\hat{B}|u_{j''}\rangle. \tag{4.51}$$

Here we can use Eq. (44) for the first (but not the last!) time, inserting the identity operator between the 
two operators, and then expressing it via a sum of projection operators: 

$$(AB)_{jj''} = \langle u_j|\hat{A}\hat{B}|u_{j''}\rangle = \langle u_j|\hat{A}\hat{I}\hat{B}|u_{j''}\rangle = \sum_{j'} \langle u_j|\hat{A}|u_{j'}\rangle\langle u_{j'}|\hat{B}|u_{j''}\rangle = \sum_{j'} A_{jj'}B_{j'j''}. \tag{4.52}$$

This result corresponds to the standard “row by column” rule of calculation of an arbitrary element of 
the matrix product 

$$\mathrm{AB} = \begin{pmatrix} A_{11} & A_{12} & \cdots \\ A_{21} & A_{22} & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} & \cdots \\ B_{21} & B_{22} & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix}. \tag{4.53}$$



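Hence the product of operators may be presented (in a fixed basis!) by that of their matrices (in the same 
basis). This is so convenient that the same language is often used to present not only the long bracket, 

$$\langle\beta|\hat{A}|\alpha\rangle = \sum_{j,j'} \beta_j^*A_{jj'}\alpha_{j'} = \begin{pmatrix} \beta_1^*, & \beta_2^*, & \cdots \end{pmatrix} \begin{pmatrix} A_{11} & A_{12} & \cdots \\ A_{21} & A_{22} & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \cdots \end{pmatrix}, \tag{4.54}$$

but even the simpler short bracket: 

$$\langle\beta|\alpha\rangle = \sum_j \beta_j^*\alpha_j = \begin{pmatrix} \beta_1^*, & \beta_2^*, & \cdots \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \cdots \end{pmatrix}, \tag{4.55}$$

although these equalities require the use of non-square matrices: rows of (complex-conjugate!) 
expansion coefficients for the presentation of bra-vectors, and columns of these coefficients for the 
presentation of ket-vectors. With that, the mapping of states and operators on matrices becomes 
completely general. 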
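
To make the row-matrix-column picture concrete, here is a minimal Python sketch of the long bracket (54), computed in the two orders allowed by the associative axiom (23). The sample components and the matrix, as well as the helper names, are assumptions of this example.

```python
def operator_times_ket(A, alpha):
    """Column of components of A|alpha>: matrix times column."""
    return [sum(A[j][k] * alpha[k] for k in range(len(alpha))) for j in range(len(A))]

def bra_times_operator(beta, A):
    """Row of components of <beta|A: conjugated row times matrix."""
    return [sum(beta[j].conjugate() * A[j][k] for j in range(len(beta)))
            for k in range(len(A[0]))]

# Sample expansion coefficients and matrix elements in a hypothetical 2-state basis.
beta = [2 + 0j, 1 - 1j]
alpha = [1 + 0j, 1j]
A = [[1 + 0j, 2 - 1j], [3j, 4 + 0j]]

# <beta|(A|alpha>): act on the ket first, then take the short bracket.
right_first = sum(b.conjugate() * x for b, x in zip(beta, operator_times_ket(A, alpha)))
# (<beta|A)|alpha>: act on the bra first, then multiply the row by the column.
left_first = sum(r * a for r, a in zip(bra_times_operator(beta, A), alpha))
print(right_first, left_first)
```

Both orders give the same scalar, -3 + 11i for these sample numbers.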


Now let us have a look at the outer product operator (26). Its matrix elements are just 

$$\left(|\alpha\rangle\langle\beta|\right)_{jj'} = \langle u_j|\alpha\rangle\langle\beta|u_{j'}\rangle = \alpha_j\beta_{j'}^*. \tag{4.56}$$

These are elements of a very special square matrix, whose filling requires the knowledge of just 2N 
scalars (where N is the basis set size), rather than $N^2$ scalars as for an arbitrary operator. However, a 
simple generalization of such an outer product may present an arbitrary operator. Indeed, let us insert two 
identity operators (44), with different summation indices, on both sides of any operator: 

$$\hat{A} = \hat{I}\hat{A}\hat{I} = \left(\sum_j |u_j\rangle\langle u_j|\right)\hat{A}\left(\sum_{j'} |u_{j'}\rangle\langle u_{j'}|\right), \tag{4.57}$$

and use the associative axiom to rewrite this expression as 

$$\hat{A} = \sum_{j,j'} |u_j\rangle\langle u_j|\hat{A}|u_{j'}\rangle\langle u_{j'}|. \tag{4.58}$$

But the expression in the middle long bracket is just the matrix element (47), so that we may write 

$$\hat{A} = \sum_{j,j'} |u_j\rangle A_{jj'}\langle u_{j'}|. \tag{4.59}$$



The reader has to agree that this formula, which is a natural generalization of Eq. (44), is extremely 
elegant. Also note the following parallel: if we consider the matrix element definition (47) as some sort 
of analog of Eq. (40), then Eq. (59) is a similar analog of the expansion expressed by Eq. (37). 


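The matrix presentation is so convenient that it makes sense to move it by one level lower - from 
state vector products to “bare” state vectors resulting from an operator’s action upon a given state. For 
example, let us use Eq. (59) to present the ket-vector (18) as 

$$|\alpha'\rangle \equiv \hat{A}|\alpha\rangle = \left(\sum_{j,j'} |u_j\rangle A_{jj'}\langle u_{j'}|\right)|\alpha\rangle = \sum_{j,j'} |u_j\rangle A_{jj'}\langle u_{j'}|\alpha\rangle. \tag{4.60}$$

According to Eq. (40), the last short bracket is just $\alpha_{j'}$, so that 

$$|\alpha'\rangle = \sum_{j,j'} |u_j\rangle A_{jj'}\alpha_{j'} = \sum_j \left(\sum_{j'} A_{jj'}\alpha_{j'}\right)|u_j\rangle. \tag{4.61}$$

But the expression in the middle parentheses is just the coefficient $\alpha'_j$ of expansion (37) of the resulting 
ket-vector (60) in the same basis, so that 

$$\alpha'_j = \sum_{j'} A_{jj'}\alpha_{j'}. \tag{4.62}$$

This result corresponds to the usual rule of multiplication of a matrix by a column, so that we may 
represent any ket-vector by its column matrix, with the operator action looking like 

$$\begin{pmatrix} \alpha'_1 \\ \alpha'_2 \\ \cdots \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & \cdots \\ A_{21} & A_{22} & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \cdots \end{pmatrix}. \tag{4.63}$$
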


Absolutely similarly, the operator action on the bra-vector (21), represented by its row-matrix, is 

$$\begin{pmatrix} \alpha_1'^*, & \alpha_2'^*, & \cdots \end{pmatrix} = \begin{pmatrix} \alpha_1^*, & \alpha_2^*, & \cdots \end{pmatrix} \begin{pmatrix} (A^\dagger)_{11} & (A^\dagger)_{12} & \cdots \\ (A^\dagger)_{21} & (A^\dagger)_{22} & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix}. \tag{4.64}$$

By the way, Eq. (64) naturally raises the following question: what are the elements of the matrix 
in its right-hand part, or more exactly, what is the relation between the matrix elements of an operator 
and its Hermitian conjugate? The simplest way to get an answer is to use Eq. (25) with two arbitrary 
states (say, $u_j$ and $u_{j'}$) of the same basis in the role of $\alpha$ and $\beta$. Together with the orthonormality relation 
(38), this immediately gives¹³ 

$$\left(\hat{A}^\dagger\right)_{jj'} = \left(A_{j'j}\right)^*. \tag{4.65}$$

Thus, the matrix of the Hermitian conjugate operator is the complex conjugated and transposed matrix 
of the initial operator. This result exposes very clearly the essence of the Hermitian conjugation. It also 
shows that for the Hermitian operators, defined by Eq. (22), 

$$A_{jj'} = A^*_{j'j}, \tag{4.66}$$

i.e. any pair of their matrix elements, symmetric about the main diagonal, should be complex conjugate 
of each other. As a corollary, the main-diagonal elements have to be real: 

$$A_{jj} = A^*_{jj}, \quad \text{i.e. } \operatorname{Im}A_{jj} = 0. \tag{4.67}$$

(Matrix (50) evidently satisfies Eq. (66), so that the identity operator is Hermitian.) 
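
The rule “complex-conjugate and transpose” and the resulting Hermiticity test (66) take just a few lines of code. The following Python sketch is only an illustration; the function names `dagger` and `is_hermitian`, the tolerance, and the two sample matrices are assumptions.

```python
def dagger(a):
    """Hermitian conjugate of a matrix: transposition plus complex conjugation."""
    return [[a[j][k].conjugate() for j in range(len(a))] for k in range(len(a[0]))]

def is_hermitian(a, tol=1e-12):
    """True if every element satisfies A_jj' = (A_j'j)*, within the tolerance."""
    ad = dagger(a)
    return all(abs(ad[j][k] - a[j][k]) < tol
               for j in range(len(a)) for k in range(len(a)))

# A sample Hermitian matrix: real diagonal, mutually conjugate off-diagonal elements.
H = [[1 + 0j, 2 - 1j], [2 + 1j, -3 + 0j]]
# A sample matrix that is not Hermitian.
N = [[0j, 1 + 0j], [0j, 0j]]

print(is_hermitian(H), is_hermitian(N))
```

The first matrix passes the test and has a real main diagonal, while the second one does not coincide with its Hermitian conjugate.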


In order to fully appreciate the special role played by Hermitian operators in the quantum theory, 
let us introduce the key notions of eigenstates $a_j$ (described by their eigenvectors $\langle a_j|$ and $|a_j\rangle$) and 
eigenvalues (c-numbers) $A_j$ of an operator $\hat{A}$, defined by the equation they have to satisfy:¹⁴ 

$$\hat{A}|a_j\rangle = A_j|a_j\rangle. \tag{4.68}$$

Let us prove that eigenvalues of any Hermitian operator are real,¹⁵ 

$$A_j^* = A_j, \quad \text{for } j = 1, 2, \ldots, N, \tag{4.69}$$



13 For the sake of formula compactness, below I will use the shorthand notation in which the operands of this 
equality are just $A^\dagger_{jj'}$ and $A^*_{j'j}$. I believe that it leaves little chance for confusion, because the Hermitian 
conjugation sign † may pertain only to an operator (or its matrix), while the complex conjugation sign * to a 
scalar - say a matrix element. 

14 This equation should look familiar to the reader - see the stationary Schrödinger equation (1.60), which was the 
focus of our studies in the first three chapters. We will see soon that that equation is just a particular (coordinate) 
representation of Eq. (68) for the Hamiltonian as the operator of energy. 

15 The reciprocal statement is also true: if all eigenvalues of an operator are real, it is Hermitian (in any basis). 
This statement may be readily proved by applying Eq. (93) below to the case when $A_{kk'} = A_k\delta_{kk'}$ with $A_k^* = A_k$. 

while the eigenstates corresponding to different eigenvalues are orthogonal: 

$$\langle a_j|a_{j'}\rangle = 0, \quad \text{if } A_j \neq A_{j'}. \tag{4.70}$$

The proof of both statements is surprisingly simple. Let us inner-multiply both sides of Eq. (68) 
by bra-vector $\langle a_{j'}|$. In the right-hand part of the result, the eigenvalue $A_j$, as a c-number, may be taken out 
of the bra-ket, giving 

$$\langle a_{j'}|\hat{A}|a_j\rangle = A_j\langle a_{j'}|a_j\rangle. \tag{4.71}$$

This equality should hold for any pair of eigenstates, so that we may swap the indices in Eq. (71), and 
complex-conjugate the result: 

$$\langle a_j|\hat{A}|a_{j'}\rangle^* = A_{j'}^*\langle a_j|a_{j'}\rangle^*. \tag{4.72}$$

Now using Eqs. (14) and (25), together with the Hermitian operator definition (22), we may transform 
Eq. (72) to the following form: 

$$\langle a_{j'}|\hat{A}|a_j\rangle = A_{j'}^*\langle a_{j'}|a_j\rangle. \tag{4.73}$$

Subtracting this equation from Eq. (71), we get 

$$0 = \left(A_j - A_{j'}^*\right)\langle a_{j'}|a_j\rangle. \tag{4.74}$$

There are two possibilities to satisfy this equation. If indices j and j’ are equal (denote the same 
eigenstate), then the bra-ket is the state’s norm squared, and cannot be equal to zero. Then the left 
parentheses (with j = j’) have to be zero, i.e. Eq. (69) is valid. On the other hand, if j and j’ correspond to 
different eigenvalues, the parentheses cannot equal zero (we have just proved that all $A_j$ are real!), and 
hence the state vectors indexed by j and j’ should be orthogonal, i.e. Eq. (70) is valid. 

As will be discussed below, these properties make Hermitian operators suitable for the 
description of physical observables. 


4.4. Change of basis and matrix diagonalization 

From the discussion of the last section, it may seem that the matrix language is fully similar to, and in 
many instances more convenient than, the general bra-ket formalism. In particular, Eqs. (52), (54), (55) 
show that any part of any bra-ket expression may be directly mapped on the similar matrix expression, 
with the only slight inconvenience of using not only columns, but also rows (with their elements 
complex-conjugated), for state vector presentation. In this context, why do we need the bra-ket language 
at all? The answer is that the elements of the matrices depend on the particular choice of the basis set, 
very much like the Cartesian components of a usual vector depend on the particular choice of reference 
frame orientation (Fig. 4), and very frequently it is convenient to use two or more different basis sets for 
the same system. 

With this motivation, let us study what happens if we change from one basis, {u}, to another 
one, {v} - both full and orthonormal. First of all, let us prove that for each such pair of bases, there 
exists such an operator $\hat{U}$ that, first, 

$$|v_j\rangle = \hat{U}|u_j\rangle, \tag{4.75}$$

and, second, 

$$\hat{U}\hat{U}^\dagger = \hat{U}^\dagger\hat{U} = \hat{I}. \tag{4.76}$$

(Due to the last property,¹⁶ $\hat{U}$ is called a unitary operator, and Eq. (75), a unitary transformation.) 



Fig. 4.4. Transformation of 
components of a 2D vector at 
a reference frame rotation. 


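A very simple proof of both statements may be achieved by construction. Indeed, let us take 

$$\hat{U} \equiv \sum_{j'} |v_{j'}\rangle\langle u_{j'}| \tag{4.77}$$

- an evident generalization of Eq. (44). Then 

$$\hat{U}|u_j\rangle = \sum_{j'} |v_{j'}\rangle\langle u_{j'}|u_j\rangle = \sum_{j'} |v_{j'}\rangle\delta_{j'j} = |v_j\rangle, \tag{4.78}$$

so that Eq. (75) has been proved. Now, applying Eq. (31) to each term of sum (77), we get 

$$\hat{U}^\dagger = \sum_{j'} |u_{j'}\rangle\langle v_{j'}|, \tag{4.79}$$

so that 

$$\hat{U}\hat{U}^\dagger = \sum_{j,j'} |v_j\rangle\langle u_j|u_{j'}\rangle\langle v_{j'}| = \sum_{j,j'} |v_j\rangle\delta_{jj'}\langle v_{j'}| = \sum_j |v_j\rangle\langle v_j|. \tag{4.80}$$

But according to the closure relation (44), the last expression is just the identity operator, q.e.d.¹⁷ (The 
proof of the second equality in Eq. (76) is absolutely similar.) 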
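
The construction (77) may be repeated numerically. In the following Python sketch, the “old” basis {u} is taken to be the reference one, and the “new” basis {v} is a sample orthonormal pair of vectors; these choices, and all helper names, are assumptions made for illustration only.

```python
import math

def outer(ket, bra_of):
    """Matrix of |ket><bra_of|."""
    return [[k * b.conjugate() for b in bra_of] for k in ket]

def matmul(a, b):
    return [[sum(a[j][k] * b[k][l] for k in range(len(b)))
             for l in range(len(b[0]))] for j in range(len(a))]

def dagger(a):
    return [[a[j][k].conjugate() for j in range(len(a))] for k in range(len(a[0]))]

def apply(op, ket):
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]

s = 1 / math.sqrt(2)
u = [[1 + 0j, 0j], [0j, 1 + 0j]]        # "old" basis (assumed to be the reference one)
v = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]  # "new" basis (a sample orthonormal pair)

# U = sum over j of |v_j><u_j|.
U = [[0j, 0j], [0j, 0j]]
for uj, vj in zip(u, v):
    term = outer(vj, uj)
    U = [[U[j][k] + term[j][k] for k in range(2)] for j in range(2)]

product = matmul(U, dagger(U))          # should be the identity matrix
images = [apply(U, uj) for uj in u]     # should reproduce the "new" basis vectors
print(product, images)
```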

As a by-product of our proof, we have also got another important expression (79). It implies, in 
particular, that while, according to Eq. (77), operator $\hat{U}$ performs the transform from the “old” basis $u_j$ 
to the “new” basis $v_j$, its Hermitian adjoint $\hat{U}^\dagger$ performs the reciprocal unitary transform: 

$$\hat{U}^\dagger|v_j\rangle = \sum_{j'} |u_{j'}\rangle\langle v_{j'}|v_j\rangle = |u_j\rangle. \tag{4.81}$$

16 An alternative way to express Eq. (76) is to write $\hat{U}^\dagger = \hat{U}^{-1}$, but I will try to avoid this language. 

17 Quod erat demonstrandum (Lat.) - what needed to be proved. 

Now, let us see what the matrix elements of the unitary transform operators look like. 
Generally, as was stated above, an operator’s matrix elements depend on the basis we calculate them in, so we 
should be careful - initially. For example, let us calculate the elements in basis {u}: 

$$U_{jj'}\big|_{\text{in } u} \equiv \langle u_j|\hat{U}|u_{j'}\rangle = \langle u_j|\left(\sum_k |v_k\rangle\langle u_k|\right)|u_{j'}\rangle = \langle u_j|v_{j'}\rangle. \tag{4.82}$$

Now performing a similar calculation in basis {v}, we get 

$$U_{jj'}\big|_{\text{in } v} \equiv \langle v_j|\hat{U}|v_{j'}\rangle = \langle v_j|\left(\sum_k |v_k\rangle\langle u_k|\right)|v_{j'}\rangle = \langle u_j|v_{j'}\rangle. \tag{4.83}$$

Surprisingly, the result is the same! This is of course true for the Hermitian conjugate of the unitary 
transform operator as well: 

$$U^\dagger_{jj'}\big|_{\text{in } u} = U^\dagger_{jj'}\big|_{\text{in } v} = \langle v_j|u_{j'}\rangle. \tag{4.84}$$



These expressions may be used, first of all, to rewrite Eq. (75) in a more direct form. Applying 
the first of Eqs. (41) to state $v_{j'}$ of the “new” basis, we get 

$$|v_{j'}\rangle = \sum_j |u_j\rangle\langle u_j|v_{j'}\rangle = \sum_j U_{jj'}|u_j\rangle. \tag{4.85}$$

Similarly, the reciprocal transform is 

$$|u_{j'}\rangle = \sum_j |v_j\rangle\langle v_j|u_{j'}\rangle = \sum_j U^\dagger_{jj'}|v_j\rangle. \tag{4.86}$$

These equations are very convenient for applications; we will use them already later in this section. 

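Next, we may use Eqs. (83), (84) to express the effect of the unitary transform on expansion 
coefficients (37) of vectors of an arbitrary state $\alpha$. In the “old” basis {u}, they are given by Eq. (40). 
Similarly, in the “new” basis {v}, 

$$\alpha_j\big|_{\text{in } v} = \langle v_j|\alpha\rangle. \tag{4.87}$$

Again inserting the identity operator in the form of closure (44), with internal index j’, and then using 
Eq. (84), we get 

$$\alpha_j\big|_{\text{in } v} = \langle v_j|\left(\sum_{j'} |u_{j'}\rangle\langle u_{j'}|\right)|\alpha\rangle = \sum_{j'} \langle v_j|u_{j'}\rangle\langle u_{j'}|\alpha\rangle = \sum_{j'} U^\dagger_{jj'}\langle u_{j'}|\alpha\rangle = \sum_{j'} U^\dagger_{jj'}\alpha_{j'}\big|_{\text{in } u}. \tag{4.88}$$

The reciprocal transform is (of course) performed by matrix elements of operator $\hat{U}$: 

$$\alpha_j\big|_{\text{in } u} = \sum_{j'} U_{jj'}\alpha_{j'}\big|_{\text{in } v}. \tag{4.89}$$
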


Both structurally and philosophically, these expressions are similar to the transformation of 
components of a usual vector at coordinate frame rotation. For example, in two dimensions (Fig. 4): 

$$\begin{pmatrix} \alpha_{x'} \\ \alpha_{y'} \end{pmatrix} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} \begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix}. \tag{4.90}$$

(In this analogy, the equality to 1 of the determinant of the rotation matrix in Eq. (90) corresponds to the 
unitary property (76) of the unitary transform operators.) Please pay attention here: while the transform 
(75) from the “old” basis {u} to the “new” basis {v} is performed by the unitary operator, the change 
(88) of a state vector’s components at this transformation requires its Hermitian conjugate. Actually, this 
is also natural from the point of view of the geometric analog of the unitary transform (Fig. 4): if the 
“new” reference frame {x’, y’} is obtained by a counterclockwise rotation of the “old” frame {x, y} by 
some angle $\varphi$, for the observer rotating with the frame, vector $\boldsymbol{\alpha}$ (which is itself unchanged) rotates 
clockwise. Due to the analogy between expressions (88) and (89) on one hand, and our old friend Eq. 
(62) on the other hand, it is tempting to skip indices in our new results by writing 

$$|\alpha\rangle_{\text{in } v} = \hat{U}^\dagger|\alpha\rangle_{\text{in } u}, \qquad |\alpha\rangle_{\text{in } u} = \hat{U}|\alpha\rangle_{\text{in } v}. \tag{4.91}$$

Since matrix elements of $\hat{U}$ and $\hat{U}^\dagger$ do not depend on the basis, such language is not too bad; still, the 
symbolic Eq. (91) should not be confused with genuine (basis-independent) bra-ket equalities. 
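
The geometric analogy may be played with numerically. The following Python sketch builds the matrix elements $U_{jj'} = \langle u_j|v_{j'}\rangle$ for a frame rotated by a sample angle, and then transforms the components of a sample vector with the Hermitian-conjugate matrix, as Eq. (88) prescribes; the angle, the vector, and the helper names are assumptions of this example.

```python
import math

def inner(beta, alpha):
    """Inner product; conjugation is kept for generality, though all numbers here are real."""
    return sum(b.conjugate() * a for b, a in zip(beta, alpha))

def apply(matrix, column):
    return [sum(row[k] * column[k] for k in range(len(column))) for row in matrix]

phi = math.pi / 6  # sample rotation angle (an assumption)
u = [[1.0, 0.0], [0.0, 1.0]]                      # "old" frame {x, y}
v = [[math.cos(phi), math.sin(phi)],              # "new" frame {x', y'}, rotated
     [-math.sin(phi), math.cos(phi)]]             # counterclockwise by phi

# Matrix elements of the unitary operator: U_jj' = <u_j|v_j'>.
U = [[inner(u[j], v[k]) for k in range(2)] for j in range(2)]
U_dagger = [[U[k][j].conjugate() for k in range(2)] for j in range(2)]

alpha_in_u = [2.0, 1.0]                            # sample components in the old frame
alpha_in_v = apply(U_dagger, alpha_in_u)           # components in the new frame
direct = [inner(vj, alpha_in_u) for vj in v]       # the same components as <v_j|alpha>
back = apply(U, alpha_in_v)                        # the reciprocal transform
print(alpha_in_v, direct, back)
```

Here `U_dagger` coincides with the rotation matrix of Eq. (90), and the reciprocal transform returns the initial components.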

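Now let us use the same trick of identity operator insertion, repeated twice, to find the 
transformation rule for matrix elements of an arbitrary operator: 

$$A_{jj'}\big|_{\text{in } v} \equiv \langle v_j|\hat{A}|v_{j'}\rangle = \langle v_j|\left(\sum_k |u_k\rangle\langle u_k|\right)\hat{A}\left(\sum_{k'} |u_{k'}\rangle\langle u_{k'}|\right)|v_{j'}\rangle = \sum_{k,k'} U^\dagger_{jk}A_{kk'}\big|_{\text{in } u}U_{k'j'}; \tag{4.92}$$

absolutely similarly, we can get 

$$A_{jj'}\big|_{\text{in } u} = \sum_{k,k'} U_{jk}A_{kk'}\big|_{\text{in } v}U^\dagger_{k'j'}. \tag{4.93}$$

In the spirit of Eq. (91), we may present these results symbolically as well, in a compact bra-ket form: 

$$\hat{A}\big|_{\text{in } v} = \hat{U}^\dagger\hat{A}\big|_{\text{in } u}\hat{U}, \qquad \hat{A}\big|_{\text{in } u} = \hat{U}\hat{A}\big|_{\text{in } v}\hat{U}^\dagger. \tag{4.94}$$
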


As a sanity check, let us apply this result to the identity operator: 

$$\hat{I}\big|_{\text{in } v} = \left(\hat{U}^\dagger\hat{I}\hat{U}\right)\big|_{\text{in } u} = \left(\hat{U}^\dagger\hat{U}\right)\big|_{\text{in } u} = \hat{I} \tag{4.95}$$

- as it should be. One more invariant of the basis change is the trace of any operator, defined as the sum 
of the diagonal terms of its matrix in a certain basis: 

$$\operatorname{Tr}\hat{A} \equiv \operatorname{Tr}\mathrm{A} \equiv \sum_j A_{jj}. \tag{4.96}$$



The (easy) proof of this fact, using the relations we have already discussed, is left for reader’s exercise. 
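
A numerical check is, of course, not a proof, but it may be a useful sanity test before attempting one. The following Python sketch transforms a sample matrix according to the first of Eqs. (94) and compares the traces; the sample unitary matrix, the sample operator matrix, and the helper names are assumptions of this example.

```python
import math

def matmul(a, b):
    return [[sum(a[j][k] * b[k][l] for k in range(len(b)))
             for l in range(len(b[0]))] for j in range(len(a))]

def dagger(a):
    return [[a[j][k].conjugate() for j in range(len(a))] for k in range(len(a[0]))]

def trace(a):
    """Sum of the main-diagonal elements."""
    return sum(a[j][j] for j in range(len(a)))

s = 1 / math.sqrt(2)
U = [[s, s], [s, -s]]                              # a sample unitary matrix
A_in_u = [[1 + 0j, 2 - 1j], [2 + 1j, -3 + 0j]]     # sample matrix elements in basis {u}

# Matrix elements of the same operator in basis {v}: U^dagger A U.
A_in_v = matmul(matmul(dagger(U), A_in_u), U)
print(trace(A_in_u), trace(A_in_v))
```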

So far, I have implied that both state bases {u} and {v} are known, and the natural question is 
where this information comes from in the quantum mechanics of actual physical systems. To get a 
partial answer to this question, let us return to Eq. (68) that defines eigenstates and eigenvalues of an 
operator. Let us assume that the eigenstates $a_j$ of a certain operator $\hat{A}$ form a full and orthonormal set, 
and find the matrix elements of the operator in the basis of these states. For that, it is sufficient to 
inner-multiply both sides of Eq. (68), written for index j’, by the bra-vector of an arbitrary state $a_j$ of the same 
set: 

$$\langle a_j|\hat{A}|a_{j'}\rangle = \langle a_j|A_{j'}|a_{j'}\rangle. \tag{4.97}$$

The left-hand part is just the matrix element $A_{jj'}$ we are looking for, while the right-hand part is just 
$A_{j'}\delta_{jj'}$. As a result, we see that the matrix is diagonal, with the diagonal consisting of eigenvalues: 

$$A_{jj'} = A_j\delta_{jj'}. \tag{4.98}$$

In particular, in the eigenstate basis (but not necessarily in an arbitrary basis!), $A_{jj}$ means the same as $A_j$. 
Thus the most important problem of finding the eigenvalues and eigenstates of an operator is equivalent 
to the diagonalization of its matrix,¹⁸ i.e. finding the basis in which the corresponding operator acquires 
the diagonal form (98); then the diagonal elements are the eigenvalues, and the basis itself is the 
desired set of eigenstates. 


Let us modify the above calculation by inner-multiplying Eq. (68) by a bra-vector of a different 
basis - say, the one, denoted {u}, in which we know the matrix elements $A_{jj'}$. The multiplication gives 

$$\langle u_k|\hat{A}|a_j\rangle = \langle u_k|A_j|a_j\rangle. \tag{4.99}$$

In the left-hand part we can (as usual :-) insert the identity operator, between the operator and the 
ket-vector, and then use the closure relation (44), while in the right-hand part, we can move the eigenvalue 
$A_j$ out of the bra-ket, and then insert a summation over a new index, compensating it with the proper 
Kronecker delta symbol: 

$$\sum_{k'} \langle u_k|\hat{A}|u_{k'}\rangle\langle u_{k'}|a_j\rangle = A_j\sum_{k'} \langle u_{k'}|a_j\rangle\delta_{kk'}. \tag{4.100}$$

Moving out the sign of summation over k’, and using definition (47) of the matrix elements, we get 

$$\sum_{k'} \left(A_{kk'} - A_j\delta_{kk'}\right)\langle u_{k'}|a_j\rangle = 0. \tag{4.101}$$

But the set of such equalities, for all N possible values of index k, is just a system of linear, 
homogeneous equations for unknown c-numbers $\langle u_{k'}|a_j\rangle$. According to Eqs. (82)-(84), these numbers 
are nothing else than the matrix elements $U_{k'j}$ of a unitary matrix providing the required transformation 
from the initial basis {u} to the basis {a} that diagonalizes matrix A. The system may be presented in 
the matrix form: 

$$\begin{pmatrix} A_{11} - A_j & A_{12} & \cdots \\ A_{21} & A_{22} - A_j & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix} \begin{pmatrix} U_{1j} \\ U_{2j} \\ \cdots \end{pmatrix} = 0, \tag{4.102}$$

18 Note that the expression “matrix diagonalization” is a common and convenient, but dangerous jargon. (A matrix is 
just a matrix, an ordered set of c-numbers, and cannot be diagonalized.) It is OK to use this jargon if you 
remember clearly what it actually means - see the definition above. 

and the usual condition of its consistency, 

$$\begin{vmatrix} A_{11} - A_j & A_{12} & \cdots \\ A_{21} & A_{22} - A_j & \cdots \\ \cdots & \cdots & \cdots \end{vmatrix} = 0, \tag{4.103}$$

plays the role of the characteristic equation of the system. This equation has N roots $A_j$; plugging each 
of them back into system (102), we can use it to find N matrix elements $U_{kj}$ (k = 1, 2, ..., N) 
corresponding to this particular eigenvalue. However, since equations (102) are homogeneous, they 
allow finding $U_{kj}$ only to a constant multiplier. In order to ensure their normalization, i.e. the unitary 
character of matrix U, we may use the condition that all eigenvectors are normalized (just as the basis 
vectors are): 

$$\langle a_j|a_j\rangle = \sum_k \langle a_j|u_k\rangle\langle u_k|a_j\rangle \equiv \sum_k \left|U_{kj}\right|^2 = 1, \tag{4.104}$$

for each j. This normalization completes the diagonalization.¹⁹ 
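
For the simplest case N = 2, the whole procedure (characteristic equation, homogeneous system, normalization) fits into a short function. The following Python sketch is written under several assumptions that are not a part of the text above: the matrix is Hermitian, its off-diagonal element is nonzero (so that it is not already diagonal), and the free phase of each eigenvector is fixed by leaving the first row of system (102) unscaled. The function name is, of course, hypothetical.

```python
import math

def diagonalize_2x2_hermitian(A):
    """Return (eigenvalues, U) for a Hermitian 2x2 matrix with A[0][1] != 0.

    Column j of U holds the normalized components U_kj of the j-th eigenvector.
    """
    a, b, d = A[0][0].real, A[0][1], A[1][1].real
    if b == 0:
        raise ValueError("the matrix is already diagonal")
    # Roots of the characteristic equation (a - x)(d - x) - |b|^2 = 0.
    mean, half_diff = (a + d) / 2, (a - d) / 2
    root = math.sqrt(half_diff ** 2 + abs(b) ** 2)
    eigenvalues = [mean + root, mean - root]
    columns = []
    for lam in eigenvalues:
        # First row of the homogeneous system: (a - lam) U_1 + b U_2 = 0.
        vec = [b, lam - a]
        norm = math.sqrt(abs(vec[0]) ** 2 + abs(vec[1]) ** 2)
        columns.append([x / norm for x in vec])
    U = [[columns[j][k] for j in range(2)] for k in range(2)]
    return eigenvalues, U

eigenvalues, U = diagonalize_2x2_hermitian([[1 + 0j, 2 - 1j], [2 + 1j, -3 + 0j]])
print(eigenvalues, U)
```

For this sample matrix the eigenvalues are 2 and -4; both are real, as they should be for a Hermitian operator.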

Now (at last!) I can give the reader some examples. As a simple but very important case, let us 
diagonalize the operators described (in a certain 2-function basis {u}) by the so-called Pauli matrices 

$$\sigma_x \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y \equiv \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z \equiv \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.105}$$

Though introduced by a physicist, with the specific purpose of describing the electron’s spin, these matrices 
have a general mathematical significance, because together with the 2×2 identity matrix I, they provide 
a full, linearly-independent 2×2 basis - meaning that an arbitrary 2×2 matrix may be presented as 

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = a_0\mathrm{I} + a_x\sigma_x + a_y\sigma_y + a_z\sigma_z, \tag{4.106}$$

with a unique set of 4 coefficients a. 
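
The coefficients of expansion (106) may be found by comparing the matrix elements of its two sides one by one. The following Python sketch implements the resulting formulas and rebuilds the matrix from them; the function names and the sample matrix (deliberately taken non-Hermitian, to show that the expansion is general) are assumptions of this example.

```python
def pauli_coefficients(A):
    """Coefficients (a0, ax, ay, az) of a 2x2 matrix in the basis {I, sigma_x, sigma_y, sigma_z}."""
    a0 = (A[0][0] + A[1][1]) / 2
    ax = (A[0][1] + A[1][0]) / 2
    ay = 1j * (A[0][1] - A[1][0]) / 2
    az = (A[0][0] - A[1][1]) / 2
    return a0, ax, ay, az

def rebuild(a0, ax, ay, az):
    """The matrix a0*I + ax*sigma_x + ay*sigma_y + az*sigma_z, written element by element."""
    return [[a0 + az, ax - 1j * ay], [ax + 1j * ay, a0 - az]]

A = [[1, 2], [3, 4]]                 # an arbitrary sample matrix
coefficients = pauli_coefficients(A)
restored = rebuild(*coefficients)
print(coefficients, restored)
```

Note that for a non-Hermitian matrix some of the coefficients are complex; they are all real only if the matrix is Hermitian.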


Let us start with diagonalizing matrix $\sigma_x$. For it, the characteristic equation (103) is evidently 

$$\begin{vmatrix} -A_j & 1 \\ 1 & -A_j \end{vmatrix} = 0, \tag{4.107}$$

and has two roots, $A_{1,2} = \pm 1$. (Again, the numbering is arbitrary!) The reader may readily check that the 
eigenvalues of matrices $\sigma_y$ and $\sigma_z$ are the same. However, the eigenvectors of the operators corresponding 
to all these matrices are different. To find them for $\sigma_x$, let us plug its first eigenvalue, $A_1 = +1$, back into 
equations (101), written for this particular case: 

$$-\langle u_1|a_1\rangle + \langle u_2|a_1\rangle = 0, \qquad \langle u_1|a_1\rangle - \langle u_2|a_1\rangle = 0. \tag{4.108}$$

19 A possible slight complication here is presented by degenerate cases, when the characteristic equation gives certain equal 
eigenvalues corresponding to different eigenvectors. In this case the requirement of the mutual orthogonality of 
these states should be additionally enforced. 

The equations are compatible (of course, because the used eigenvalue $A_1 = +1$ satisfies the characteristic 
equation), and any of them gives 

$$\langle u_1|a_1\rangle = \langle u_2|a_1\rangle, \quad \text{i.e. } U_{11} = U_{21}. \tag{4.109}$$

With that, the normalization condition (104) yields 

$$\left|U_{11}\right|^2 = \left|U_{21}\right|^2 = \frac{1}{2}. \tag{4.110}$$

Although the normalization is insensitive to the simultaneous multiplication of $U_{11}$ and $U_{21}$ by the same 
phase factor $\exp\{i\varphi\}$ with any real $\varphi$, it is convenient to keep the coefficients real, for example taking 
$\varphi = 0$, i.e. to get 

$$U_{11} = U_{21} = \frac{1}{\sqrt{2}}. \tag{4.111}$$

Performing an absolutely similar calculation for the second characteristic value, $A_2 = -1$, we get 
$U_{12} = -U_{22}$, and we may choose the common phase to get 

$$U_{12} = -U_{22} = \frac{1}{\sqrt{2}}, \tag{4.112}$$

so that the whole unitary matrix for diagonalization of the operator corresponding to $\sigma_x$ is²⁰ 

$$\mathrm{U} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{4.113}$$

For what follows, it will be convenient to have this result expressed in the ket-relation form - see Eqs. 
(85)-(86): 

$$|a_1\rangle = U_{11}|u_1\rangle + U_{21}|u_2\rangle = \frac{1}{\sqrt{2}}\left(|u_1\rangle + |u_2\rangle\right), \qquad |a_2\rangle = U_{12}|u_1\rangle + U_{22}|u_2\rangle = \frac{1}{\sqrt{2}}\left(|u_1\rangle - |u_2\rangle\right), \tag{4.114}$$

$$|u_1\rangle = U^\dagger_{11}|a_1\rangle + U^\dagger_{21}|a_2\rangle = \frac{1}{\sqrt{2}}\left(|a_1\rangle + |a_2\rangle\right), \qquad |u_2\rangle = U^\dagger_{12}|a_1\rangle + U^\dagger_{22}|a_2\rangle = \frac{1}{\sqrt{2}}\left(|a_1\rangle - |a_2\rangle\right). \tag{4.115}$$


These results are already sufficient to understand the Stem-Gerlach experiments described in 
Sec. 1 - with two additional postulates. The first of them is that particle’s interaction with external 
magnetic field may be described by the following vector operator of the dipole magnetic moment: 21 


m = yS, 


(4.116) 


Magnetic 

moment 

operator 


where the coefficient y, specific for every particle type, is called the gyromagnetic ratio , 12 and S is the 
vector operator of spin. For the so-called spin-V2 particles (including the electron), this operator may be 
represented, in the so-called z-basis, by the following 3D vector of the Pauli matrices (105): 


20 Note that though this particular unitary matrix is Hermitian, this is not true for an arbitrary choice of phases $\varphi$.

21 This is the key point in the electron's spin description, developed by W. Pauli in 1925-1927.

22 For an electron, with its negative charge $q = -e$, the gyromagnetic ratio is negative: $\gamma_e = -g_e e/2m_e$, where $g_e \approx 2$ is the dimensionless g-factor. Due to quantum electrodynamics effects, the factor is slightly higher than 2: $g_e = 2(1 + \alpha/2\pi + \ldots) \approx 2.002319304\ldots$, where $\alpha \equiv e^2/4\pi\varepsilon_0\hbar c \approx 1/137$ is the fine structure (“Sommerfeld”) constant.

$$\mathbf{S} = \frac{\hbar}{2}\boldsymbol{\sigma}, \quad \text{where } \boldsymbol{\sigma} \equiv \mathbf{n}_x\sigma_x + \mathbf{n}_y\sigma_y + \mathbf{n}_z\sigma_z, \qquad (4.117)$$


and $\mathbf{n}_{x,y,z}$ are the usual Cartesian unit vectors in 3D space. (In the quantum-mechanics sense, they are just c-numbers, or rather “c-vectors”.) The z-basis, in which Eq. (117) is valid, is defined as an orthonormal basis of two states, frequently denoted $\uparrow$ and $\downarrow$, in which the z-component of the vector operator of spin is diagonal, with eigenvalues $+\hbar/2$ and $-\hbar/2$. Note that we do not “understand” what exactly these states are, 23 but loosely associate them with a certain internal rotation of the electron about the z-axis, with either positive or negative angular momentum component $S_z$. However, any attempt to use such a classical interpretation for quantitative predictions runs into fundamental difficulties - see Sec. 5.7 below.


The second new postulate describes the general relation between the bra-ket formalism and experiment. 24 Namely, in quantum mechanics, each real observable $A$ is represented by a Hermitian operator $\hat{A} = \hat{A}^\dagger$, and a result of its measurement in a quantum state $\alpha$, described by a linear superposition of the eigenstates $a_j$ of the operator,

$$|\alpha\rangle = \sum_j \alpha_j |a_j\rangle, \quad \text{with } \alpha_j = \langle a_j|\alpha\rangle, \qquad (4.118)$$

may be only one of the corresponding eigenvalues $A_j$. 25 If state (118) and all eigenstates $a_j$ are normalized to unity,

$$\langle\alpha|\alpha\rangle = 1, \quad \langle a_j|a_j\rangle = 1, \qquad (4.119)$$

then the probability of outcome $A_j$ is 26

$$W_j = |\alpha_j|^2 = \alpha_j^*\alpha_j = \langle\alpha|a_j\rangle\langle a_j|\alpha\rangle. \qquad (4.120)$$

[Margin note: Quantum measurement postulate]

This relation is evidently a generalization of Eq. (1.22) in wave mechanics. As a sanity check, let us assume that the set of eigenstates $a_j$ is full, and calculate the sum of all the probabilities:

$$\sum_j W_j = \sum_j \langle\alpha|a_j\rangle\langle a_j|\alpha\rangle = \langle\alpha|\hat{I}|\alpha\rangle = 1. \qquad (4.121)$$

Now returning to the Stern-Gerlach experiment, conceptually the description of the first (z- 
oriented) experiment shown in Fig. 1 is the hardest for us, because the statistical ensemble describing 
the unpolarized electron beam at its input is mixed (“incoherent”), and cannot be described by a pure 


23 If you think about it, the word “understand” typically means that we can explain a new, more complex notion in terms of those discussed earlier and considered “known”. In our example, we cannot express the spin states by some wavefunction $\psi(\mathbf{r})$, or any other mathematical notion discussed earlier. The bra-ket formalism has been invented exactly to enable mathematical analysis of such “new” quantum states.

24 Here again, just like in Sec. 1.2, the statement implies the abstract (mathematical) notion of “ideal experiments”, postponing the discussion of real (physical) measurements until Sec. 7.7.

25 As a reminder, in the end of Sec. 3 we have already proved that such eigenstates corresponding to different $A_j$ are orthogonal. If any of these values is degenerate, i.e. corresponds to several different eigenstates, they should be also selected orthogonal, in order for Eq. (118) to be valid.

26 This key relation, in particular, explains the most common term for the (generally, complex) coefficients $\alpha_j$, the probability amplitudes.

(“coherent”) superposition of the type (6) that has been the subject of our studies so far. (We will discuss the mixed ensembles in Chapter 7.) However, it is intuitively clear that its results, and in particular Eq. (6), are compatible with the description of its two output beams as sets of electrons in pure states $\uparrow$ and $\downarrow$, respectively. The absorber following that first stage (Fig. 2) just takes all spin-down electrons out of the picture, producing an output beam of polarized electrons in a pure $\uparrow$ state. For such a beam, probabilities (120) are $W_\uparrow = 1$ and $W_\downarrow = 0$. This is certainly compatible with the result of the “control” experiment shown on the bottom panel of Fig. 2: the repeated SG (z) stage does not split such a beam, keeping the probabilities the same.

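Now let us discuss the double Stern-Gerlach experiment shown on the top panel of Fig. 2. For that, let us present the z-polarized beam in another basis of two states (I will denote them as $\rightarrow$ and $\leftarrow$) in which, by definition, the matrix of operator $S_x$ is diagonal. But this is exactly the set we called $a_{1,2}$ in the $\sigma_x$ matrix diagonalization problem solved above. On the other hand, states $\uparrow$ and $\downarrow$ are exactly what we called $u_{1,2}$ in that problem, because in this basis, matrices $\sigma_z$ and hence $S_z$ are diagonal. Hence, in application to the electron spin problem, we may rewrite Eqs. (114)-(115) as

$$|\rightarrow\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle + |\downarrow\rangle\right), \quad |\leftarrow\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle - |\downarrow\rangle\right), \qquad (4.122)$$

$$|\uparrow\rangle = \frac{1}{\sqrt{2}}\left(|\rightarrow\rangle + |\leftarrow\rangle\right), \quad |\downarrow\rangle = \frac{1}{\sqrt{2}}\left(|\rightarrow\rangle - |\leftarrow\rangle\right). \qquad (4.123)$$

[Margin note: Relation between eigenvectors of operators $S_x$ and $S_z$]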


Currently, for us the first of Eqs. (123) is most important, because it shows that the quantum state of electrons entering the SG (x) stage may be presented as a coherent superposition of electrons with $S_x = +\hbar/2$ and $S_x = -\hbar/2$. Notice that the beams have equal probability amplitude moduli, so that according to Eq. (120), the split beams $\rightarrow$ and $\leftarrow$ have equal intensities, in accordance with experiment. (The minus sign before the second ket-vector is of no consequence here, though it may have an impact on the outcome of other experiments - for example if the $\rightarrow$ and $\leftarrow$ beams are brought together again.)

Now, let us discuss the most mysterious (from the classical point of view) multi-stage SG experiment shown on the middle panel of Fig. 2. After the second absorber has taken out all electrons in, say, the $\leftarrow$ state, the remaining electrons in state $\rightarrow$ are passed to the final, SG (z), stage. But according to the first of Eqs. (122), this state may be presented as a (coherent) linear superposition of the $\uparrow$ and $\downarrow$ states, with equal amplitudes. The stage separates these two states into separate beams, with equal probabilities $W_\uparrow = W_\downarrow = 1/2$ to find an electron in each of them, thus explaining the experimental results.
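The probabilities quoted for the three Stern-Gerlach arrangements can be reproduced with a few lines of code. The sketch below applies Eq. (4.120) to the z-basis columns of the states $\uparrow$, $\downarrow$, $\rightarrow$, $\leftarrow$; it assumes NumPy, and the helper name `probability` is purely illustrative.

```python
import numpy as np

# z-basis representations of the four states used in the text
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
right = (up + down) / np.sqrt(2)   # state with S_x = +hbar/2, Eq. (4.122)
left = (up - down) / np.sqrt(2)    # state with S_x = -hbar/2, Eq. (4.122)

def probability(outcome, state):
    """W = |<outcome|state>|^2 for normalized vectors, Eq. (4.120)."""
    return abs(np.vdot(outcome, state)) ** 2

# Control experiment: a repeated SG(z) stage acting on the 'up' beam
w_control = (probability(up, up), probability(down, up))

# Double experiment: SG(x) stage acting on the 'up' beam
w_x_stage = (probability(right, up), probability(left, up))

# Multi-stage experiment: final SG(z) stage acting on the 'right' beam
w_final = (probability(up, right), probability(down, right))

print(w_control, w_x_stage, w_final)
```

The repeated SG (z) stage does not split the beam, while both the SG (x) stage and the final SG (z) stage split their input beams into halves.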

To conclude our discussion of the multistage Stern-Gerlach experiment, let me note that though it cannot be explained in terms of wave mechanics (which operates with scalar de Broglie waves), it has an analogy in classical theories of vector fields, such as classical electrodynamics. Let a plane electromagnetic wave propagate perpendicular to the plane of drawing, and pass through linear polarizer 1. Similarly to the initial SG (z) stages (including the following absorbers) shown in Fig. 2, the polarizer produces a wave linearly polarized in one direction - the vertical direction in Fig. 3. Its electric field vector has no horizontal component, as may be revealed by the wave's full absorption in a perpendicular polarizer 3. However, let us pass the wave through polarizer 2 first. In this case, the output wave does acquire a horizontal component, as can be, again, revealed by passing it through polarizer 3. If the angles between polarization directions 1 and 2, and between 2 and 3, are both equal to $\pi/4$, each polarizer reduces the wave amplitude by a factor of $\sqrt{2}$, and hence intensity by a factor of 2, exactly like in the multistage SG experiment, with polarizer 2 playing the role of the SG (x) stage. The “only” difference is that the necessary angle is $\pi/4$, rather than $\pi/2$ for the Stern-Gerlach experiment. In quantum electrodynamics (see Chapter 9 below), which confirms the classical predictions for this experiment, this difference is explained by that between the integer spin of the electromagnetic field quanta, photons, and the half-integer spin of electrons.



Fig. 4.3. Light polarization sequence similar to the 3-stage Stern-Gerlach experiment shown on the middle panel of Fig. 2.


4.5. Observables: Expectation values and uncertainties

After this particular (and hopefully inspiring) example, let us discuss the general relation between the Dirac formalism and experiment in more detail. The expectation value of an observable over any statistical ensemble (not necessarily coherent) may be always calculated using the general rule (1.37). For the particular case of a coherent superposition (118), we can combine that definition with Eq. (120) and the second of Eqs. (118), and then use Eqs. (59) and (98) to write

$$\langle A\rangle = \sum_j A_j W_j = \sum_j \alpha_j^* A_j \alpha_j = \sum_j \langle\alpha|a_j\rangle A_j \langle a_j|\alpha\rangle = \sum_{j,j'} \langle\alpha|a_j\rangle\langle a_j|\hat{A}|a_{j'}\rangle\langle a_{j'}|\alpha\rangle. \qquad (4.124)$$

Now using the completeness relation (44) twice, with indices $j$ and $j'$, we arrive at a very simple and important formula 27

$$\langle A\rangle = \langle\alpha|\hat{A}|\alpha\rangle. \qquad (4.125)$$

This is a clear analog of the wave-mechanics formula (1.23) - and as we will see in the next chapter, may be used to derive it. A huge advantage of Eq. (125) is that it does not explicitly involve the eigenvector set of the corresponding operator, and allows the calculation to be performed in any convenient basis. 28
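The equivalence of the two ends of the chain (4.124)-(4.125) is easy to test numerically. The following sketch, which assumes NumPy and uses a randomly generated Hermitian matrix as a stand-in for $\hat{A}$ (an illustrative choice, not taken from the text), compares $\sum_j A_j W_j$ with the long bracket $\langle\alpha|\hat{A}|\alpha\rangle$ calculated directly in the original basis.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# A random Hermitian matrix playing the role of the operator A (illustrative)
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (M + M.conj().T) / 2

# Eigenvalues A_j and orthonormal eigenvectors a_j (columns of 'a')
A_j, a = np.linalg.eigh(A)

# A random normalized state alpha, Eq. (4.119)
alpha = rng.normal(size=n) + 1j * rng.normal(size=n)
alpha /= np.linalg.norm(alpha)

# Probability amplitudes alpha_j = <a_j|alpha> and probabilities W_j, Eq. (4.120)
alpha_j = a.conj().T @ alpha
W = np.abs(alpha_j) ** 2

# Expectation value in two ways: Eq. (4.124) versus Eq. (4.125)
mean_from_probabilities = np.sum(A_j * W)
mean_from_bracket = np.vdot(alpha, A @ alpha).real

print(W.sum(), mean_from_probabilities, mean_from_bracket)
```

Note that the second calculation never uses the eigenvectors of the operator, which is exactly the advantage of Eq. (4.125) mentioned above.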


For example, let us consider an arbitrary state $\alpha$ of spin-1/2, and calculate the expectation values of its components. The calculations are easiest in the z-basis, because we know the operators of the components in that basis - see Eq. (117). Representing the ket- and bra-vectors of our state as linear superpositions of vectors of the basis states $\uparrow$ and $\downarrow$,

$$|\alpha\rangle = \alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle, \quad \langle\alpha| = \langle\uparrow|\alpha_\uparrow^* + \langle\downarrow|\alpha_\downarrow^*, \qquad (4.126)$$


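27 This equality reveals the full beauty of Dirac's notation. Indeed, initially the quantum-mechanical brackets just resembled the angular brackets used for statistical averaging. Now we see that in this particular (but most important) case, the angular brackets of these two types may be indeed equal to each other!

28 Note that Eq. (120) may be rewritten in the form similar to Eq. (125): $W_j = \langle\alpha|\hat{\Lambda}_j|\alpha\rangle$, where $\hat{\Lambda}_j \equiv |a_j\rangle\langle a_j|$ is the operator (42) of projection upon the $j$-th eigenstate $a_j$.

and plugging these expressions into Eq. (125) written for observable $S_z$, we get

$$\langle S_z\rangle = \left(\langle\uparrow|\alpha_\uparrow^* + \langle\downarrow|\alpha_\downarrow^*\right)\hat{S}_z\left(\alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle\right)$$
$$= \alpha_\uparrow\alpha_\uparrow^*\langle\uparrow|\hat{S}_z|\uparrow\rangle + \alpha_\downarrow\alpha_\downarrow^*\langle\downarrow|\hat{S}_z|\downarrow\rangle + \alpha_\uparrow\alpha_\downarrow^*\langle\downarrow|\hat{S}_z|\uparrow\rangle + \alpha_\downarrow\alpha_\uparrow^*\langle\uparrow|\hat{S}_z|\downarrow\rangle. \qquad (4.127)$$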


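Now there are two equivalent ways (both very simple :-) to calculate the bra-kets in this expression. The first one is to represent each of them in the matrix form in the z-basis, in which bra- and ket-vectors of states $\uparrow$ and $\downarrow$ are, respectively, matrix-rows (1, 0) and (0, 1), or the similar matrix-columns. Another (perhaps more elegant) way is to use the general Eq. (59), for the z-basis, to write

$$\hat{S}_x = \frac{\hbar}{2}\left(|\uparrow\rangle\langle\downarrow| + |\downarrow\rangle\langle\uparrow|\right), \quad \hat{S}_y = \frac{\hbar}{2}\left(-i|\uparrow\rangle\langle\downarrow| + i|\downarrow\rangle\langle\uparrow|\right), \quad \hat{S}_z = \frac{\hbar}{2}\left(|\uparrow\rangle\langle\uparrow| - |\downarrow\rangle\langle\downarrow|\right). \qquad (4.128)$$

[Margin note: Spin-1/2 component operators]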

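For our particular calculation, we may plug the last of these expressions into Eq. (127), and use the orthonormality conditions (119):

$$\langle\uparrow|\uparrow\rangle = \langle\downarrow|\downarrow\rangle = 1, \quad \langle\uparrow|\downarrow\rangle = \langle\downarrow|\uparrow\rangle = 0. \qquad (4.129)$$

Both calculations give (of course) the same result:

$$\langle S_z\rangle = \frac{\hbar}{2}\left(\alpha_\uparrow\alpha_\uparrow^* - \alpha_\downarrow\alpha_\downarrow^*\right). \qquad (4.130)$$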


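This particular result might be also obtained using Eq. (120) for probabilities $W_\uparrow = \alpha_\uparrow\alpha_\uparrow^*$ and $W_\downarrow = \alpha_\downarrow\alpha_\downarrow^*$:

$$\langle S_z\rangle = W_\uparrow\left(+\frac{\hbar}{2}\right) + W_\downarrow\left(-\frac{\hbar}{2}\right) = \alpha_\uparrow\alpha_\uparrow^*\left(+\frac{\hbar}{2}\right) + \alpha_\downarrow\alpha_\downarrow^*\left(-\frac{\hbar}{2}\right). \qquad (4.131)$$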


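The formal way (127), based on using Eq. (125), has, however, an advantage of being applicable, without any change, to finding the observables whose operators are not diagonal in the z-basis, as well. In particular, absolutely similar calculations give

$$\langle S_x\rangle = \alpha_\uparrow\alpha_\uparrow^*\langle\uparrow|\hat{S}_x|\uparrow\rangle + \alpha_\downarrow\alpha_\downarrow^*\langle\downarrow|\hat{S}_x|\downarrow\rangle + \alpha_\uparrow\alpha_\downarrow^*\langle\downarrow|\hat{S}_x|\uparrow\rangle + \alpha_\downarrow\alpha_\uparrow^*\langle\uparrow|\hat{S}_x|\downarrow\rangle = \frac{\hbar}{2}\left(\alpha_\uparrow\alpha_\downarrow^* + \alpha_\downarrow\alpha_\uparrow^*\right), \qquad (4.132)$$

$$\langle S_y\rangle = \alpha_\uparrow\alpha_\uparrow^*\langle\uparrow|\hat{S}_y|\uparrow\rangle + \alpha_\downarrow\alpha_\downarrow^*\langle\downarrow|\hat{S}_y|\downarrow\rangle + \alpha_\uparrow\alpha_\downarrow^*\langle\downarrow|\hat{S}_y|\uparrow\rangle + \alpha_\downarrow\alpha_\uparrow^*\langle\uparrow|\hat{S}_y|\downarrow\rangle = i\frac{\hbar}{2}\left(\alpha_\uparrow\alpha_\downarrow^* - \alpha_\downarrow\alpha_\uparrow^*\right). \qquad (4.133)$$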


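Similarly, we can express, via the same coefficients $\alpha_\uparrow$ and $\alpha_\downarrow$, the r.m.s. fluctuations of all spin components. For example, let us have a good look at the spin state $\uparrow$. According to Eq. (126), in this state $\alpha_\uparrow = 1$ and $\alpha_\downarrow = 0$, so that Eqs. (130)-(133) yield:

$$\langle S_z\rangle = \frac{\hbar}{2}, \quad \langle S_x\rangle = \langle S_y\rangle = 0. \qquad (4.134)$$

Now let us use the same Eq. (125) to calculate the spin component uncertainties. According to Eqs. (105) and (117), operators of spin component squared are equal to $(\hbar/2)^2\hat{I}$, so that the general Eq. (1.33) yields

$$(\delta S_z)^2 = \langle S_z^2\rangle - \langle S_z\rangle^2 = \left(\frac{\hbar}{2}\right)^2\langle\uparrow|\hat{I}|\uparrow\rangle - \left(\frac{\hbar}{2}\right)^2 = 0, \qquad (4.135a)$$

$$(\delta S_x)^2 = \langle S_x^2\rangle - \langle S_x\rangle^2 = \langle\uparrow|\hat{S}_x^2|\uparrow\rangle - 0 = \left(\frac{\hbar}{2}\right)^2\langle\uparrow|\hat{I}|\uparrow\rangle = \left(\frac{\hbar}{2}\right)^2, \qquad (4.135b)$$

$$(\delta S_y)^2 = \langle S_y^2\rangle - \langle S_y\rangle^2 = \langle\uparrow|\hat{S}_y^2|\uparrow\rangle - 0 = \left(\frac{\hbar}{2}\right)^2\langle\uparrow|\hat{I}|\uparrow\rangle = \left(\frac{\hbar}{2}\right)^2. \qquad (4.135c)$$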
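The results (4.134)-(4.135) for the $\uparrow$ state may be double-checked with the z-basis matrices (4.117). The sketch below assumes NumPy and works in units with $\hbar = 1$ (an assumption made only for numerical convenience); the helper names are illustrative.

```python
import numpy as np

hbar = 1.0  # assumption: units in which hbar = 1

# Spin-1/2 component matrices in the z-basis, Eq. (4.117)
Sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * hbar * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)

def expectation(op, state):
    """<A> = <alpha|A|alpha>, Eq. (4.125), for a normalized state."""
    return np.vdot(state, op @ state).real

def uncertainty(op, state):
    """r.m.s. uncertainty: sqrt(<A^2> - <A>^2)."""
    variance = expectation(op @ op, state) - expectation(op, state) ** 2
    return np.sqrt(max(variance, 0.0))  # guard against tiny negative rounding

up = np.array([1, 0], dtype=complex)

means = [expectation(S, up) for S in (Sx, Sy, Sz)]
deltas = [uncertainty(S, up) for S in (Sx, Sy, Sz)]

print(means)   # <S_x>, <S_y>, <S_z>
print(deltas)  # delta S_x, delta S_y, delta S_z
```

In these units the expectation values are 0, 0, 1/2 and the uncertainties are 1/2, 1/2, 0, in agreement with Eqs. (4.134) and (4.135).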

While Eqs. (134) and (135a) are compatible with the classical notion of the spin being “definitely in the $\uparrow$ state”, this correspondence should not be overstretched to the interpretation of this state as a certain (z) orientation of the electron's magnetic moment $\mathbf{m}$, because such a classical picture cannot explain Eqs. (135b) and (135c). The best (but still imprecise!) classical image I can offer is the magnetic moment $\mathbf{m}$ oriented, on the average, in the z-direction, but still having x- and y-components strongly “wobbling” about their zero average values.

It is straightforward to verify that in the x-polarized and y-polarized states the situation is similar, with the corresponding change of indices. Thus, in none of these states may all 3 components of the spin have exact values. Let me show that this is not just an accidental fact, but reflects the most profound property of quantum mechanics, the uncertainty relations. Consider 2 observables, $A$ and $B$, that may be measured in the same quantum state. There are two possibilities here. If operators corresponding to the observables commute,

$$[\hat{A}, \hat{B}] = 0, \qquad (4.136)$$

then all the matrix elements of the commutator in any orthogonal basis (in particular, in the basis of eigenstates $a_j$ of operator $\hat{A}$) are also zero. From here, we get

$$\langle a_j|[\hat{A},\hat{B}]|a_{j'}\rangle = \langle a_j|\hat{A}\hat{B}|a_{j'}\rangle - \langle a_j|\hat{B}\hat{A}|a_{j'}\rangle = 0. \qquad (4.137)$$

In the first bra-ket of the middle expression, let us act by operator $\hat{A}$ on the bra-vector, while in the second one, on the ket-vector. According to Eq. (68), such action turns operators into the corresponding eigenvalues, so that we get

$$A_j\langle a_j|\hat{B}|a_{j'}\rangle - A_{j'}\langle a_j|\hat{B}|a_{j'}\rangle = (A_j - A_{j'})\langle a_j|\hat{B}|a_{j'}\rangle = 0. \qquad (4.138)$$

This means that if eigenstates of operator $\hat{A}$ are non-degenerate (i.e. $A_j \neq A_{j'}$ if $j \neq j'$), the matrix of operator $\hat{B}$ has to be diagonal in basis $a_j$, i.e., the eigenstate sets of operators $\hat{A}$ and $\hat{B}$ coincide. Such pairs of observables, that share their eigenstates, are called compatible. For example, in wave mechanics of a particle, momentum (1.26) and the kinetic energy (1.27) are compatible, sharing eigenfunctions (1.29). Now we see that this is not accidental, because each Cartesian component of the kinetic energy is proportional to the square of the corresponding component of the momentum, and any operator commutes with an arbitrary power of itself:

$$[\hat{A}, \hat{A}^n] = \hat{A}\,\hat{A}\hat{A}\ldots\hat{A} - \hat{A}\hat{A}\ldots\hat{A}\,\hat{A} = 0. \qquad (4.139)$$

Now, what if operators $\hat{A}$ and $\hat{B}$ do not commute? Then the following general uncertainty relation is valid: 29

$$\delta A\,\delta B \geq \frac{1}{2}\left|\left\langle[\hat{A},\hat{B}]\right\rangle\right|. \qquad (4.140)$$


The proof of Eq. (140) may be divided into two steps, the first of which proves the so-called Schwarz inequality: 30

$$\langle\alpha|\alpha\rangle\langle\beta|\beta\rangle \geq |\langle\alpha|\beta\rangle|^2. \qquad (4.141)$$


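The proof may be started by using postulate (16) - that the norm of any legitimate state of the system cannot be negative. Let us apply this postulate to the state with the following ket-vector:

$$|\delta\rangle \equiv |\alpha\rangle - \frac{\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}|\beta\rangle, \qquad (4.142)$$

where $\alpha$ and $\beta$ are possible, non-null states of the system, so that the denominator in Eq. (142) is not equal to zero. For this case, Eq. (16) gives

$$\left(\langle\alpha| - \frac{\langle\alpha|\beta\rangle}{\langle\beta|\beta\rangle}\langle\beta|\right)\left(|\alpha\rangle - \frac{\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}|\beta\rangle\right) \geq 0. \qquad (4.143)$$

Opening the parentheses, we get

$$\langle\alpha|\alpha\rangle - \frac{\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}\langle\alpha|\beta\rangle - \frac{\langle\alpha|\beta\rangle}{\langle\beta|\beta\rangle}\langle\beta|\alpha\rangle + \frac{\langle\alpha|\beta\rangle\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle^2}\langle\beta|\beta\rangle \geq 0. \qquad (4.144)$$

After the cancellation of one inner product $\langle\beta|\beta\rangle$ in the numerator and denominator of the last term, it cancels with the 2nd (or 3rd) term, leaving $\langle\alpha|\alpha\rangle - |\langle\alpha|\beta\rangle|^2/\langle\beta|\beta\rangle \geq 0$ and thus proving the Schwarz inequality (141).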

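Now let us apply this inequality to states

$$|\alpha\rangle \equiv \hat{\tilde{A}}|\gamma\rangle \quad \text{and} \quad |\beta\rangle \equiv \hat{\tilde{B}}|\gamma\rangle, \qquad (4.145)$$

where, in both relations, $\gamma$ is the same (but otherwise arbitrary) possible state of the system, and the deviation operators are defined similarly to observable deviations (see Sec. 1.2), for example,

$$\hat{\tilde{A}} \equiv \hat{A} - \langle A\rangle. \qquad (4.146)$$

With this substitution, and taking into account that the observable operators $\hat{A}$ and $\hat{B}$ are Hermitian, Eq. (141) yields

$$\langle\gamma|\hat{\tilde{A}}^2|\gamma\rangle\langle\gamma|\hat{\tilde{B}}^2|\gamma\rangle \geq \left|\langle\gamma|\hat{\tilde{A}}\hat{\tilde{B}}|\gamma\rangle\right|^2. \qquad (4.147)$$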


29 Note that both sides of Eq. (140) are state-specific; the uncertainty relation statement is that this inequality should be valid for any possible quantum state of the system.

30 This inequality is the quantum-mechanical analog of the usual vector algebra result $\alpha^2\beta^2 \geq |\boldsymbol{\alpha}\cdot\boldsymbol{\beta}|^2$.

[Margin notes: General uncertainty relation (4.140); Schwarz inequality (4.141)]

Since state $\gamma$ is arbitrary, we may use Eq. (125) to rewrite this relation as an operator inequality:

$$\delta A\,\delta B \geq \left|\left\langle\hat{\tilde{A}}\hat{\tilde{B}}\right\rangle\right|. \qquad (4.148)$$


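Actually, this is already an uncertainty relation, even “better” (stronger) than its standard form (140); moreover, it is more convenient in some cases. In order to proceed to Eq. (140), we need a couple more steps. First, let us notice that the operator product in Eq. (148) may be recast as

$$\hat{\tilde{A}}\hat{\tilde{B}} = \frac{1}{2}\left\{\hat{\tilde{A}},\hat{\tilde{B}}\right\} - \frac{i}{2}\hat{C}, \quad \text{where } \hat{C} \equiv i\left[\hat{\tilde{A}},\hat{\tilde{B}}\right]. \qquad (4.149)$$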


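Any anticommutator of Hermitian operators, including that in Eq. (149), is a Hermitian operator, and its eigenvalues are purely real, so that its expectation value (in any state) is also purely real. On the other hand, the commutator part of Eq. (149) is just

$$\hat{C} \equiv i\left[\hat{\tilde{A}},\hat{\tilde{B}}\right] = i\left(\hat{A} - \langle A\rangle\right)\left(\hat{B} - \langle B\rangle\right) - i\left(\hat{B} - \langle B\rangle\right)\left(\hat{A} - \langle A\rangle\right) = i\left(\hat{A}\hat{B} - \hat{B}\hat{A}\right) \equiv i\left[\hat{A},\hat{B}\right]. \qquad (4.150)$$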


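Second, according to Eqs. (52) and (65), the Hermitian conjugate of any product of Hermitian operators $\hat{A}$ and $\hat{B}$ is just the product of swapped operators. Using this fact, we may write

$$\hat{C}^\dagger = \left(i[\hat{A},\hat{B}]\right)^\dagger = -i(\hat{A}\hat{B})^\dagger + i(\hat{B}\hat{A})^\dagger = -i\hat{B}\hat{A} + i\hat{A}\hat{B} = i[\hat{A},\hat{B}] = \hat{C}, \qquad (4.151)$$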


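so that operator $\hat{C}$ is also Hermitian, i.e. its eigenvalues are also real, and thus its average is purely real as well. As a result, the square of the modulus of the average of the operator product (149) may be presented as

$$\left|\left\langle\hat{\tilde{A}}\hat{\tilde{B}}\right\rangle\right|^2 = \frac{1}{4}\left\langle\left\{\hat{\tilde{A}},\hat{\tilde{B}}\right\}\right\rangle^2 + \frac{1}{4}\left\langle\hat{C}\right\rangle^2. \qquad (4.152)$$

Since the first term in the right-hand part of this equality cannot be negative,

$$\left|\left\langle\hat{\tilde{A}}\hat{\tilde{B}}\right\rangle\right|^2 \geq \frac{1}{4}\left\langle\hat{C}\right\rangle^2 = \frac{1}{4}\left|\left\langle[\hat{A},\hat{B}]\right\rangle\right|^2, \qquad (4.153)$$

and we can continue Eq. (148) as

$$\delta A\,\delta B \geq \left|\left\langle\hat{\tilde{A}}\hat{\tilde{B}}\right\rangle\right| \geq \frac{1}{2}\left|\left\langle[\hat{A},\hat{B}]\right\rangle\right|, \qquad (4.154)$$

thus proving Eq. (140).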


For the particular case of operators $\hat{x}$ and $\hat{p}_x$ (or a similar pair of operators for another Cartesian coordinate), we can readily combine Eq. (140) with Eq. (2.14b) to prove the original Heisenberg's uncertainty relation (2.13). For the spin-1/2 operators defined by Eq. (117), it is straightforward (and highly recommended to the reader) to show that

$$\left[\hat{S}_x, \hat{S}_y\right] = i\hbar\hat{S}_z, \qquad (4.155)$$

[Margin note: Commutation relation for spin-1/2 component operators]

with similar relations for other pairs of indices taken in the “correct” order (from x to y to z to x, etc.). As a result, the uncertainty relations (140) for spin-1/2 particles, notably including electrons, are

$$\delta S_x\,\delta S_y \geq \frac{\hbar}{2}\left|\langle S_z\rangle\right|, \text{ etc.} \qquad (4.156)$$

[Margin note: Uncertainty relations for spin-1/2 components]
In particular, in the $\uparrow$ state, the right-hand part of this relation equals $(\hbar/2)^2$, and neither of the uncertainties $\delta S_x$, $\delta S_y$ can equal zero. As a reminder, our direct calculation earlier in this section has shown that each of these uncertainties is equal to $\hbar/2$, i.e. their product equals the lowest value allowed by the uncertainty relation (156). In this aspect, the spin-polarized states are similar to the Gaussian wave packets studied in Sec. 2.2.
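The commutation relation (4.155) and the uncertainty relation (4.156) can also be probed numerically. The sketch below assumes NumPy and units with $\hbar = 1$; the random sampling of states and the helper names are illustrative choices rather than anything prescribed by the text. It checks the commutator exactly and then counts violations of Eq. (4.156) over many random spin-1/2 states.

```python
import numpy as np

hbar = 1.0  # assumption: units in which hbar = 1
Sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * hbar * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)

# Commutation relation (4.155): [S_x, S_y] = i hbar S_z
commutator_ok = np.allclose(Sx @ Sy - Sy @ Sx, 1j * hbar * Sz)

def expectation(op, state):
    return np.vdot(state, op @ state).real

def uncertainty(op, state):
    variance = expectation(op @ op, state) - expectation(op, state) ** 2
    return np.sqrt(max(variance, 0.0))

rng = np.random.default_rng(2)
violations = 0
for _ in range(1000):
    state = rng.normal(size=2) + 1j * rng.normal(size=2)
    state /= np.linalg.norm(state)
    lhs = uncertainty(Sx, state) * uncertainty(Sy, state)
    rhs = 0.5 * hbar * abs(expectation(Sz, state))
    if lhs < rhs - 1e-12:   # small tolerance for rounding errors
        violations += 1

# The 'up' state saturates the relation: both sides equal (hbar/2)^2
up = np.array([1, 0], dtype=complex)
lhs_up = uncertainty(Sx, up) * uncertainty(Sy, up)
rhs_up = 0.5 * hbar * abs(expectation(Sz, up))

print(commutator_ok, violations, lhs_up, rhs_up)
```

No sampled state violates the inequality, and for the $\uparrow$ state the two sides coincide, as stated above.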


4.6. Quantum dynamics: Three pictures 

So far in this chapter, I shied away from the discussion of system dynamics, implying that the bra- and ket-vectors of the system are their “snapshots” at a certain instant $t$. Now we are sufficiently prepared to examine their time dependence. One of the most beautiful features of quantum mechanics is that the time evolution may be described using any of three alternative “pictures”, giving exactly the same final results for expectation values of all observables.


From the standpoint of our wave mechanics experience, the Schrödinger picture is the most natural. In this picture, the operators corresponding to time-independent observables (e.g., to the Hamiltonian function $H$ of an isolated system) are also constant, while the bra- and ket-vectors of the quantum state of the system evolve in time as

$$\langle\alpha(t)| = \langle\alpha(t_0)|\hat{u}^\dagger(t,t_0), \quad |\alpha(t)\rangle = \hat{u}(t,t_0)|\alpha(t_0)\rangle, \qquad (4.157)$$

where $\hat{u}(t,t_0)$ is the time-evolution operator, which obeys the following differential equation:

$$i\hbar\dot{\hat{u}} = \hat{H}\hat{u}, \qquad (4.158)$$

[Margin note: Schrödinger equation of operator evolution]

where $\hat{H}$ is the Hamiltonian operator of the system (that is always Hermitian, $\hat{H}^\dagger = \hat{H}$), and the dot means the differentiation over argument $t$, but not $t_0$. While this equation is a very natural replacement of the wave-mechanical equation (1.25), and is also frequently called the Schrödinger equation, 31 it still should be considered as a new, more general postulate, which finds its final justification (as is usual in physics) in the agreement of its corollaries with experiment - more exactly, in having not a single credible contradiction with experiment.

Starting the discussion of Eqs. (157)-(158), let us first consider the case of a system described by a time-independent Hamiltonian, whose eigenstates $a_n$ and eigenvalues $E_n$ obey Eq. (68), 32

$$\hat{H}|a_n\rangle = E_n|a_n\rangle, \qquad (4.159)$$

and hence are also time-independent. (Similarly to the wavefunctions $\psi_n$ defined by Eq. (1.60), $a_n$ are called the stationary states of the system.) Let us use Eqs. (157)-(159) to calculate the law of time evolution of the expansion coefficients $\alpha_n$, defined by Eq. (118), in the stationary state basis:


31 Moreover, we will be able to derive Eq. (1.25) from Eq. (158) - see Sec. 5.2.

32 Here I intentionally use index $n$ rather than $j$ to emphasize the special role played by the stationary eigenstates $a_n$ in quantum dynamics.




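$$\dot{\alpha}_n(t) \equiv \frac{d}{dt}\langle a_n|\alpha(t)\rangle = \frac{d}{dt}\langle a_n|\hat{u}(t,t_0)|\alpha(t_0)\rangle = \langle a_n|\dot{\hat{u}}(t,t_0)|\alpha(t_0)\rangle$$
$$= \langle a_n|\left(-\frac{i}{\hbar}\right)\hat{H}\hat{u}(t,t_0)|\alpha(t_0)\rangle = -\frac{i}{\hbar}E_n\langle a_n|\hat{u}(t,t_0)|\alpha(t_0)\rangle = -\frac{i}{\hbar}E_n\langle a_n|\alpha(t)\rangle = -\frac{i}{\hbar}E_n\alpha_n. \qquad (4.160)$$

[Margin notes: Time evolution of stationary states (4.161); Pauli Hamiltonian: operator (4.163); Pauli Hamiltonian: z-basis matrix (4.164)]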

This is the same simple equation as Eq. (1.59), and its integration yields a similar result - cf. Eq. (1.61), just with the initial time $t_0$ rather than 0:

$$\alpha_n(t) = \alpha_n(t_0)\exp\left\{-\frac{i}{\hbar}E_n(t - t_0)\right\}. \qquad (4.161)$$


In order to illustrate how this result works, let us consider spin-1/2 dynamics in a time-independent, uniform external magnetic field $\boldsymbol{\mathscr{B}}$, taking its direction for axis z. To construct the system's Hamiltonian, we may apply the correspondence principle to the classical expression for the energy of a magnetic moment $\mathbf{m}$ in the external magnetic field $\boldsymbol{\mathscr{B}}$, 33

$$U = -\mathbf{m}\cdot\boldsymbol{\mathscr{B}}. \qquad (4.162)$$

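In quantum mechanics, the operator corresponding to the moment $\mathbf{m}$ is given by Eq. (116) (suggested by W. Pauli), so that the spin-field interaction is described by the so-called Pauli Hamiltonian:

$$\hat{H} = -\hat{\mathbf{m}}\cdot\boldsymbol{\mathscr{B}} = -\gamma\hat{\mathbf{S}}\cdot\boldsymbol{\mathscr{B}} = -\gamma\mathscr{B}\hat{S}_z, \qquad (4.163)$$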

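where $\hat{S}_z$ is the operator of the z-component of the electron's spin. According to Eq. (117), in the z-basis of states $\uparrow$ and $\downarrow$, the matrix of operator (163) is

$$\mathrm{H} = -\gamma\mathscr{B}\frac{\hbar}{2}\sigma_z \equiv -\frac{\hbar\Omega}{2}\sigma_z, \quad \text{with } \Omega \equiv \gamma\mathscr{B}. \qquad (4.164)$$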


The constant $\Omega$ so defined coincides with the classical frequency of the precession of a symmetric top, with an angular momentum $\mathbf{S}$ and magnetic moment $\mathbf{m} = \gamma\mathbf{S}$, about axis z, induced by external torque $\boldsymbol{\tau} = \mathbf{m}\times\boldsymbol{\mathscr{B}}$: 34


$$\Omega = \frac{\tau}{S}. \qquad (4.165a)$$

For an electron, with its negative gyromagnetic ratio $\gamma_e = -g_e e/2m_e$, neglecting the minor difference between factors $g_e$ and 2, we get

$$\Omega = -\frac{e}{m_e}\mathscr{B}, \qquad (4.165b)$$

i.e. the frequency's magnitude coincides with that of the cyclotron frequency $\omega_c$ - see Eq. (3.48).

In order to apply the general Eq. (161), at this stage we would need to find the eigenstates $a_n$ and eigenenergies $E_n$ of our Hamiltonian. However, with our (smart :-) choice of the direction of axis z, the Hamiltonian matrix is already diagonal:

$$\mathrm{H} = \frac{\hbar\Omega}{2}\begin{pmatrix} -1 & 0 \\ 0 & +1 \end{pmatrix}, \qquad (4.166)$$

33 See, e.g., EM Eq. (5.100). As a reminder, we have already used this expression for the derivation of Eq. (3).

34 See, e.g., CM Sec. 6.5, in particular Eq. (6.72), and EM Sec. 5.5, in particular Eq. (5.114) and its discussion.

meaning that $\uparrow$ and $\downarrow$ are the eigenstates of the system, with eigenenergies, respectively,

$$E_\uparrow = -\frac{\hbar\Omega}{2} \quad \text{and} \quad E_\downarrow = +\frac{\hbar\Omega}{2}. \qquad (4.167)$$

[Margin note: Spin-1/2 in magnetic field: eigenenergies]

(Note that their difference,

$$\Delta E \equiv |E_\uparrow - E_\downarrow| = \hbar|\Omega| = \hbar|\gamma\mathscr{B}|, \qquad (4.168)$$

corresponds to the classical energy $2|m\mathscr{B}|$ of flipping the magnetic dipole with moment $m = \gamma\hbar/2$, oriented along the direction of field $\boldsymbol{\mathscr{B}}$. 35) With that, Eq. (161) immediately yields the following expressions for the time evolution of the expansion coefficients:


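$$\alpha_\uparrow(t) = \alpha_\uparrow(t_0)\exp\left\{+\frac{i}{2}\Omega(t - t_0)\right\}, \quad \alpha_\downarrow(t) = \alpha_\downarrow(t_0)\exp\left\{-\frac{i}{2}\Omega(t - t_0)\right\}, \qquad (4.169)$$

allowing a ready calculation of time evolution of the expectation values of any observable.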

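In particular, we can calculate the expectation value of $S_z$ as a function of time by applying Eq. (130) to an arbitrary time moment $t$ (here and below taking, for notation simplicity, $t_0 = 0$):

$$\langle S_z\rangle(t) = \frac{\hbar}{2}\left[\alpha_\uparrow(t)\alpha_\uparrow^*(t) - \alpha_\downarrow(t)\alpha_\downarrow^*(t)\right] = \frac{\hbar}{2}\left[\alpha_\uparrow(0)\alpha_\uparrow^*(0) - \alpha_\downarrow(0)\alpha_\downarrow^*(0)\right] = \langle S_z\rangle(0). \qquad (4.170)$$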


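Thus the expectation value of the spin component parallel to the applied magnetic field remains constant, regardless of the initial state of the system. However, this is not true for the components perpendicular to the field. For example, Eq. (132), applied to moment $t$, gives (with $t_0 = 0$)

$$\langle S_x\rangle(t) = \frac{\hbar}{2}\left[\alpha_\uparrow(t)\alpha_\downarrow^*(t) + \alpha_\downarrow(t)\alpha_\uparrow^*(t)\right] = \frac{\hbar}{2}\left[\alpha_\uparrow(0)\alpha_\downarrow^*(0)e^{i\Omega t} + \alpha_\downarrow(0)\alpha_\uparrow^*(0)e^{-i\Omega t}\right]. \qquad (4.171)$$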


Clearly, this expression describes sinusoidal oscillations with frequency (165). The amplitude and phase of these oscillations depend on initial conditions. Indeed, solving Eqs. (132)-(133) for the expansion coefficient products, we get relations

$$\hbar\alpha_\uparrow(t)\alpha_\downarrow^*(t) = \langle S_x\rangle(t) - i\langle S_y\rangle(t), \quad \hbar\alpha_\downarrow(t)\alpha_\uparrow^*(t) = \langle S_x\rangle(t) + i\langle S_y\rangle(t), \qquad (4.172)$$

valid for any time $t$. Plugging their values for $t = 0$ into Eq. (171) (with $t_0 = 0$), we get

$$\langle S_x\rangle(t) = \frac{1}{2}\left[\langle S_x\rangle(0) - i\langle S_y\rangle(0)\right]e^{i\Omega t} + \frac{1}{2}\left[\langle S_x\rangle(0) + i\langle S_y\rangle(0)\right]e^{-i\Omega t}$$
$$= \langle S_x\rangle(0)\cos\Omega t + \langle S_y\rangle(0)\sin\Omega t. \qquad (4.173)$$

An absolutely similar calculation using Eq. (133) gives

$$\langle S_y\rangle(t) = \langle S_y\rangle(0)\cos\Omega t - \langle S_x\rangle(0)\sin\Omega t. \qquad (4.174)$$

35 Note also that if the product $\gamma\mathscr{B}$ is positive, so is $\Omega$, so that $E_\uparrow$ is negative, while $E_\downarrow$ is positive. This is in correspondence with the classical picture of a magnetic dipole $\mathbf{m}$ having negative potential energy when it is aligned with the external magnetic field $\boldsymbol{\mathscr{B}}$ - see Eq. (162).

These formulas show, for example, that if at moment $t = 0$ the spin's state was $\uparrow$, i.e. $\langle S_x\rangle(0) = \langle S_y\rangle(0) = 0$, then the amplitude of oscillation of both “lateral” components of spin vanishes. On the other hand, if the spin was initially in state $\rightarrow$, i.e. had the definite, maximum possible value of $S_x$, equal to $\hbar/2$ (in classics, we would say “the spin $\hbar/2$ was oriented in direction x”), then both expectation values $\langle S_x\rangle$ and $\langle S_y\rangle$ oscillate in time 36 with this amplitude, with the phase shift $\pi/2$ between them. These formulas may be interpreted as the torque-induced precession of the Cartesian components of the spin vector of length $S = \hbar/2$, confined in plane [x, y], with classical frequency $\Omega = \gamma\mathscr{B}$ about axis z (counterclockwise if $\gamma\mathscr{B} > 0$).
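The precession formulas (4.170), (4.173) and (4.174) may be checked against a direct evolution of the expansion coefficients by Eq. (4.169). The sketch below assumes NumPy, units with $\hbar = 1$, $t_0 = 0$, and arbitrarily chosen illustrative values of $\Omega$ and of the initial state (none of these numbers come from the text).

```python
import numpy as np

hbar = 1.0    # assumption: units in which hbar = 1
Omega = 2.0   # assumption: an arbitrary value of gamma*B

Sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * hbar * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)

# Eigenenergies (4.167) of the Pauli Hamiltonian, in the order (up, down)
energies = np.array([-0.5 * hbar * Omega, +0.5 * hbar * Omega])

def expectation(op, state):
    return np.vdot(state, op @ state).real

def evolve(state0, t):
    """Eq. (4.169) with t_0 = 0: each coefficient acquires its own phase factor."""
    return np.exp(-1j * energies * t / hbar) * state0

# An arbitrary (illustrative) normalized initial state
state0 = np.array([1.0, 0.5 + 0.3j], dtype=complex)
state0 /= np.linalg.norm(state0)
sx0, sy0, sz0 = (expectation(S, state0) for S in (Sx, Sy, Sz))

times = np.linspace(0.0, 3.0, 7)
sx_numeric = np.array([expectation(Sx, evolve(state0, t)) for t in times])
sy_numeric = np.array([expectation(Sy, evolve(state0, t)) for t in times])
sz_numeric = np.array([expectation(Sz, evolve(state0, t)) for t in times])

# Closed-form results (4.173) and (4.174)
sx_formula = sx0 * np.cos(Omega * times) + sy0 * np.sin(Omega * times)
sy_formula = sy0 * np.cos(Omega * times) - sx0 * np.sin(Omega * times)

print(np.max(np.abs(sx_numeric - sx_formula)))
print(np.max(np.abs(sy_numeric - sy_formula)))
```

The same arrays also show that $\langle S_z\rangle$ stays at its initial value, in agreement with Eq. (4.170).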

Thus, the gyromagnetic ratio is just the angular frequency of the torque-induced precession of spin (about the field's direction) per unit magnetic field; for electrons, $|\gamma_e| \approx 1.761\times10^{11}\ \mathrm{s^{-1}T^{-1}}$, for protons, the ratio is much smaller because of their larger mass: $\gamma_p \approx 2.675\times10^{8}\ \mathrm{s^{-1}T^{-1}}$, and for larger spin-1/2 nuclei, $\gamma$ may be much smaller still - e.g., $8.681\times10^{6}\ \mathrm{s^{-1}T^{-1}}$ for the $^{57}$Fe nucleus. 37

Note, however, that this classical language does not describe large quantum-mechanical 
uncertainties of these observables, which are absent in the classical picture of the precession - at least 
when it starts from a definite orientation of the angular momentum vector. 

Now let us return to the discussion of the general Schrödinger equation (158) and prove the following fascinating fact: it is possible to write the general solution of this operator equation. In the easiest case when the Hamiltonian is time-independent, this solution is an exact analog of Eq. (161),

$$\hat{u}(t,t_0) = \hat{u}(t_0,t_0)\exp\left\{-\frac{i}{\hbar}\hat{H}(t - t_0)\right\} = \hat{I}\exp\left\{-\frac{i}{\hbar}\hat{H}(t - t_0)\right\}. \qquad (4.175)$$


To start its proof we should, first of all, understand what a function (in this case, the exponent) of an operator means. In the operator (and matrix) algebra, such functions are defined by their Taylor expansions; in particular, Eq. (175) means that

$$\hat{u}(t,t_0) = \hat{I}\sum_{k=0}^{\infty}\frac{1}{k!}\left[-\frac{i}{\hbar}\hat{H}(t - t_0)\right]^k = \hat{I} + \frac{1}{1!}\left(-\frac{i}{\hbar}\right)\hat{H}(t - t_0) + \frac{1}{2!}\left(-\frac{i}{\hbar}\right)^2\hat{H}^2(t - t_0)^2 + \frac{1}{3!}\left(-\frac{i}{\hbar}\right)^3\hat{H}^3(t - t_0)^3 + \ldots, \qquad (4.176)$$

where $\hat{H}^2 \equiv \hat{H}\hat{H}$, $\hat{H}^3 \equiv \hat{H}\hat{H}\hat{H}$, etc. Working with such infinite series of operator products is not as hard as one could imagine, due to their regular structure. For example, let us differentiate Eq. (176) over $t$:

36 This is one more (hopefully, redundant :-) illustration of the difference between averaging over the statistical ensemble and over time: in Eqs. (170), (173)-(174), and quite a few relations below, only the former averaging has been performed, so the results are still functions of time.

37 Such composite particles as nuclei (and, from the point of view of high-energy physics, even such hadrons as protons) may be characterized by a certain net spin (and hence by certain $\gamma$) only if during the considered process their internal degrees of freedom remain in a certain (usually, ground) quantum state.

$$\dot{\hat{u}}(t,t_0) = \hat{0} + \frac{1}{1!}\left(-\frac{i}{\hbar}\right)\hat{H} + \frac{1}{2!}\left(-\frac{i}{\hbar}\right)^2\hat{H}^2\,2(t - t_0) + \frac{1}{3!}\left(-\frac{i}{\hbar}\right)^3\hat{H}^3\,3(t - t_0)^2 + \ldots$$
$$= \left(-\frac{i}{\hbar}\right)\hat{H}\left[\hat{I} + \frac{1}{1!}\left(-\frac{i}{\hbar}\right)\hat{H}(t - t_0) + \frac{1}{2!}\left(-\frac{i}{\hbar}\right)^2\hat{H}^2(t - t_0)^2 + \ldots\right] = -\frac{i}{\hbar}\hat{H}\hat{u}(t,t_0), \qquad (4.177)$$


so that the differential equation (158) is indeed satisfied. On the other hand, Eq. (175) also satisfies the initial condition

$$\hat{u}(t_0,t_0) = \hat{u}^\dagger(t_0,t_0) = \hat{I}, \qquad (4.178)$$

which immediately follows from the definition (157) of the evolution operator, so it is indeed the 
(unique) solution for the time evolution operator - in the Schrodinger picture. 
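To see the operator exponent (4.175)-(4.176) at work, one may compare a truncated Taylor series with an independent evaluation of the same exponent through the eigen-decomposition of the Hamiltonian. The sketch below assumes NumPy, units with $\hbar = 1$, and an arbitrary illustrative 2×2 Hermitian matrix in the role of $\hat{H}$; it also checks Eq. (4.158) by a finite-difference derivative.

```python
import numpy as np

hbar = 1.0  # assumption: units in which hbar = 1

# An arbitrary (illustrative) time-independent Hermitian Hamiltonian matrix
H = 0.5 * np.array([[-1.0, 0.3], [0.3, 1.0]], dtype=complex)

def u_taylor(H, dt, terms=30):
    """Partial sum of the Taylor series (4.176) for exp{-i H dt / hbar}."""
    X = -1j * H * dt / hbar
    result = np.eye(H.shape[0], dtype=complex)
    term = np.eye(H.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ X / k          # now term = X^k / k!
        result = result + term
    return result

def u_exact(H, dt):
    """The same exponent, calculated via the eigen-decomposition of H."""
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * dt / hbar)) @ V.conj().T

dt = 1.5
series_error = np.max(np.abs(u_taylor(H, dt) - u_exact(H, dt)))

# Unitarity of the evolution operator
u = u_exact(H, dt)
unitarity_error = np.max(np.abs(u @ u.conj().T - np.eye(2)))

# Finite-difference check of Eq. (4.158): i hbar du/dt = H u
h = 1e-5
du_dt = (u_exact(H, dt + h) - u_exact(H, dt - h)) / (2 * h)
equation_residual = np.max(np.abs(1j * hbar * du_dt - H @ u))

print(series_error, unitarity_error, equation_residual)
```

For this small matrix, 30 terms of the series are more than enough; for Hamiltonians with a large norm of $\hat{H}(t - t_0)/\hbar$ the direct summation converges slowly, and the eigen-decomposition route is usually preferable.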

Now let us allow operator $\hat{H}$ to be a function of time, but with the condition that its “values” (in fact, operators) at different instants commute with each other:

$$\left[\hat{H}(t'), \hat{H}(t'')\right] = 0, \quad \text{for any } t', t''. \qquad (4.179)$$

(An important example of such a Hamiltonian is that of a particle under the effect of a classical, time-dependent force $\mathbf{F}(t)$:

$$\hat{H}_F = -\mathbf{F}(t)\cdot\hat{\mathbf{r}}. \qquad (4.180)$$

Indeed, the radius-vector operator $\hat{\mathbf{r}}$ does not depend explicitly on time and hence commutes with itself, as well as with c-numbers $\mathbf{F}(t')$ and $\mathbf{F}(t'')$.) In this case it is sufficient to replace, in all above formulas, product $\hat{H}(t - t_0)$ with the corresponding integral over time; in particular, Eq. (175) is generalized as


$$\hat{u}(t,t_0) = \hat{I}\exp\left\{-\frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t')\,dt'\right\}. \qquad (4.181)$$

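This replacement means that the first form of Eq. (176) should be replaced with

$$\hat{u}(t,t_0) = \hat{I} + \sum_{k=1}^{\infty}\frac{1}{k!}\left(-\frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t')\,dt'\right)^k = \hat{I} + \sum_{k=1}^{\infty}\frac{1}{k!}\left(-\frac{i}{\hbar}\right)^k\int_{t_0}^{t}dt_1\int_{t_0}^{t}dt_2\ldots\int_{t_0}^{t}dt_k\,\hat{H}(t_1)\hat{H}(t_2)\ldots\hat{H}(t_k). \qquad (4.182)$$

The proof that the first form of Eq. (182) satisfies Eq. (158) is absolutely similar to the one carried out above.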


We may now use Eq. (181) to show that the time-evolution operator is unitary at any moment, even for the time-dependent Hamiltonian. Indeed, from that formula,

$$\hat{u}(t,t_0)\hat{u}^\dagger(t,t_0) = \hat{I}\exp\left\{-\frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t')\,dt'\right\}\hat{I}\exp\left\{+\frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t'')\,dt''\right\}. \qquad (4.183)$$

Since each of the exponents may be presented with the Taylor series (182), and, thanks to Eq. (179), different components of these sums may be swapped at will, expression (183) may be manipulated exactly as the product of c-number exponents, in particular rewritten as

$$\hat{u}(t,t_0)\hat{u}^\dagger(t,t_0) = \hat{I}\exp\left\{-\frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t')\,dt' + \frac{i}{\hbar}\int_{t_0}^{t}\hat{H}(t'')\,dt''\right\} = \hat{I}\exp\{\hat{0}\} = \hat{I}. \qquad (4.184)$$

[Margin note: Evolution operator: explicit expression (4.181)]

This property ensures, in particular, that the system state’s normalization does not depend on time: 

$$\langle\alpha(t)|\alpha(t)\rangle = \langle\alpha(t_0)|\hat{u}^\dagger(t,t_0)\hat{u}(t,t_0)|\alpha(t_0)\rangle = \langle\alpha(t_0)|\alpha(t_0)\rangle. \tag{4.185}$$
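The unitarity argument above may be illustrated with a toy matrix model of the Hamiltonian (180). This is a sketch under stated assumptions, not part of the original derivation: ħ = 1, the operator r is replaced by a fixed 2×2 Hermitian matrix `X`, and the force is taken as F(t) = cos t, so that Hamiltonians at different times commute. The functions `u_commuting` and `u_stepwise` are hypothetical helper names.

```python
import numpy as np

HBAR = 1.0  # assumption: hbar = 1


def exp_i(M, s):
    """Return exp(-i s M) for a Hermitian matrix M and a real number s."""
    w, v = np.linalg.eigh(M)
    return (v * np.exp(-1j * s * w)) @ v.conj().T


X = np.array([[0.0, 1.0], [1.0, 0.0]])   # fixed Hermitian matrix, toy stand-in for r
F = np.cos                                # hypothetical c-number force F(t)


def u_commuting(t, t0=0.0):
    # H(t) = -F(t) X, so the integral of H from t0 to t equals -(sin t - sin t0) X.
    integral = -(np.sin(t) - np.sin(t0))
    return exp_i(X, integral / HBAR)


def u_stepwise(t, t0=0.0, steps=2000):
    # Product of short-time propagators; later times are multiplied from the left.
    dt = (t - t0) / steps
    u = np.eye(2, dtype=complex)
    for k in range(steps):
        t_mid = t0 + (k + 0.5) * dt
        u = exp_i(X, -F(t_mid) * dt / HBAR) @ u
    return u


t = 2.0
u = u_commuting(t)
unitarity_error = np.abs(u @ u.conj().T - np.eye(2)).max()
mismatch = np.abs(u - u_stepwise(t)).max()

alpha0 = np.array([0.6, 0.8j])            # normalized initial ket
norm_t = np.vdot(u @ alpha0, u @ alpha0).real
print(unitarity_error, mismatch, norm_t)
```

The closed-form exponential of the integral agrees with the step-by-step product, and the norm of the state stays equal to 1, as Eq. (185) requires.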


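The most difficult cases for the explicit solution of Eq. (158) are those when Eq. (179) is
violated. 38 It may be proven that in these cases the integral limits in the last form of Eq. (182) should be
truncated (and the factor 1/k! dropped, because the resulting time-ordered integration domain is just 1/k! of the full one), giving the so-called Dyson series

$$\hat{u}(t,t_0) = \hat{I} + \sum_{k=1}^{\infty}\left(-\frac{i}{\hbar}\right)^{k}\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\dots\int_{t_0}^{t_{k-1}}dt_k\,\hat{H}(t_1)\hat{H}(t_2)\dots\hat{H}(t_k). \tag{4.186}$$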


Since we would not have time to use this relation in our course, I will skip its proof. 39 
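Although the proof is skipped, the need for time ordering is easy to see numerically. The sketch below does not sum the Dyson series itself; instead it builds the time-ordered product of many short-time propagators (which is what that series represents) and compares it with a blind application of Eq. (181). The Hamiltonian H(t) = σ_z + tσ_x, the choice ħ = 1, and all function names are assumptions made for illustration only.

```python
import numpy as np

HBAR = 1.0  # assumption: hbar = 1

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)


def H(t):
    # Hypothetical Hamiltonian whose "values" at different times do not commute.
    return sz + t * sx


def exp_minus_i(M):
    """Return exp(-i M / hbar) for a Hermitian matrix M."""
    w, v = np.linalg.eigh(M)
    return (v * np.exp(-1j * w / HBAR)) @ v.conj().T


def u_time_ordered(t, steps=4000):
    # Short-time propagators, with later times multiplied from the left.
    dt = t / steps
    u = np.eye(2, dtype=complex)
    for k in range(steps):
        u = exp_minus_i(H((k + 0.5) * dt) * dt) @ u
    return u


def u_naive(t):
    # Eq. (181) used outside its validity range:
    # the integral of H(t') from 0 to t is sz*t + sx*t**2/2.
    return exp_minus_i(sz * t + sx * t**2 / 2)


t = 1.0
u_ord, u_nav = u_time_ordered(t), u_naive(t)
mismatch = np.abs(u_ord - u_nav).max()
unitarity_error = np.abs(u_ord @ u_ord.conj().T - np.eye(2)).max()
print(mismatch, unitarity_error)
```

The time-ordered result stays unitary, but it differs noticeably from the naive exponential, which is exactly why Eq. (181) cannot be used when Eq. (179) is violated.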

Let me now return to the general discussion of quantum dynamics to outline its alternative, the
Heisenberg picture. For that, let us recall that according to Eq. (125), in quantum mechanics the
expectation value of any observable A is a long bra-ket. Below we will see that other quantities (say, the
rates of quantum transitions between pairs of different states, say $\alpha$ and $\beta$) may also be measured in
experiment; the most general form for all such measurable quantities is the following long bracket:

$$\langle\alpha|\hat{A}|\beta\rangle. \tag{4.187}$$


As has been discussed above, in the Schrodinger picture the bra- and ket-vectors of the states are time- 
dependent, while the variable operators stay constant (if the corresponding variables do not explicitly 
depend on time), so that Eq. (187), applied to moment t, may be presented as 

$$\langle\alpha(t)|\hat{A}_S|\beta(t)\rangle, \tag{4.188}$$


where index “S” emphasizes the Schrodinger picture. Let us apply to the bra- and ket-vectors in this 
expression the evolution law (157): 

$$\langle\alpha|\hat{A}|\beta\rangle = \langle\alpha(t_0)|\hat{u}^\dagger(t,t_0)\hat{A}_S\hat{u}(t,t_0)|\beta(t_0)\rangle. \tag{4.189}$$


This equality means that if we form a long bracket with bra- and ket-vectors of the initial-time states, 
together with the following time-dependent Heisenberg operator 40 


$$\hat{A}_H(t) \equiv \hat{u}^\dagger(t,t_0)\hat{A}_S\hat{u}(t,t_0) = \hat{u}^\dagger(t,t_0)\hat{A}_H(t_0)\hat{u}(t,t_0), \tag{4.190}$$

all experimentally measurable results will remain the same as in the Schrödinger picture:

$$\langle\alpha|\hat{A}|\beta\rangle = \langle\alpha(t_0)|\hat{A}_H(t,t_0)|\beta(t_0)\rangle. \tag{4.191}$$


38 We will run into such situations in Chapter 7, but will not need to apply Eq. (186). 

39 It may be found, for example, in Chapter 5 of J. Sakurai’s textbook - see References. 

40 Note this relation is similar in structure to the symbolic Eqs. (94). 
Let us see how the Heisenberg picture works for the same simple (but very important!)
problem of the spin-1/2 precession in a z-oriented magnetic field, described (in the z-basis) by the
Hamiltonian matrix (164). In that basis, Eq. (158) for the time-evolution operator reads


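$$i\hbar\begin{pmatrix}\dot{u}_{11} & \dot{u}_{12}\\ \dot{u}_{21} & \dot{u}_{22}\end{pmatrix} = \frac{\hbar\Omega}{2}\begin{pmatrix}-1 & 0\\ 0 & +1\end{pmatrix}\begin{pmatrix}u_{11} & u_{12}\\ u_{21} & u_{22}\end{pmatrix} = \frac{\hbar\Omega}{2}\begin{pmatrix}-u_{11} & -u_{12}\\ u_{21} & u_{22}\end{pmatrix}. \tag{4.192}$$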


We see that in this simple case the equations for different matrix elements of the evolution operator 
matrix are decoupled, and readily solvable, using the universal initial condition (178): 41 


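$$\mathrm{u}(t,0) = \begin{pmatrix}e^{i\Omega t/2} & 0\\ 0 & e^{-i\Omega t/2}\end{pmatrix} = \mathrm{I}\cos\frac{\Omega t}{2} + i\sigma_z\sin\frac{\Omega t}{2}. \tag{4.193}$$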


Now we can use Eq. (190) to find the Heisenberg-picture operators of spin components. Dropping index 
“H” for brevity (the Heisenberg-picture operators are clearly marked by their dependence on time 
anyway), we get 


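$$\begin{aligned}
\mathrm{S}_x(t) &= \mathrm{u}^\dagger(t,0)\,\mathrm{S}_x(0)\,\mathrm{u}(t,0) = \frac{\hbar}{2}\,\mathrm{u}^\dagger(t,0)\,\sigma_x\,\mathrm{u}(t,0)\\
&= \frac{\hbar}{2}\begin{pmatrix}e^{-i\Omega t/2} & 0\\ 0 & e^{i\Omega t/2}\end{pmatrix}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}e^{i\Omega t/2} & 0\\ 0 & e^{-i\Omega t/2}\end{pmatrix}
= \frac{\hbar}{2}\begin{pmatrix}0 & e^{-i\Omega t}\\ e^{i\Omega t} & 0\end{pmatrix}\\
&= \frac{\hbar}{2}\left[\sigma_x\cos\Omega t + \sigma_y\sin\Omega t\right] \equiv \mathrm{S}_x(0)\cos\Omega t + \mathrm{S}_y(0)\sin\Omega t.
\end{aligned} \tag{4.194}$$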


Absolutely similar calculations of the other spin components yield 


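$$\mathrm{S}_y(t) = \frac{\hbar}{2}\begin{pmatrix}0 & -ie^{-i\Omega t}\\ ie^{i\Omega t} & 0\end{pmatrix} = \frac{\hbar}{2}\left[\sigma_y\cos\Omega t - \sigma_x\sin\Omega t\right] \equiv \mathrm{S}_y(0)\cos\Omega t - \mathrm{S}_x(0)\sin\Omega t, \tag{4.195}$$

$$\mathrm{S}_z(t) = \frac{\hbar}{2}\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix} = \frac{\hbar}{2}\sigma_z = \mathrm{S}_z(0). \tag{4.196}$$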


A practical advantage of these formulas is that they describe the system's evolution for arbitrary
initial conditions, thus making the analysis of the initial state effects very simple. Indeed, since in the
Heisenberg picture the expectation values of observables are calculated using Eq. (191) (with $\beta = \alpha$),
with time-independent bra- and ket-vectors, such averaging of Eqs. (194)-(196) immediately returns us
to Eqs. (170), (173), and (174), obtained in the Schrödinger picture. Moreover, these equations for the
Heisenberg operators formally coincide with the classical equations of the torque-induced precession for
c-number variables. (In the next chapter, we will see that the same exact mapping is valid for the
Heisenberg picture of the orbital motion.)
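The matrix algebra leading to the Heisenberg-picture spin operators can be re-traced numerically. The following minimal sketch assumes ħ = 1, an arbitrary value of Ω, and the Hamiltonian matrix in the form (ħΩ/2)·diag(−1, +1), which is my reading of matrix (164) as it enters Eq. (192); the helper names `u` and `heisenberg` are illustrative.

```python
import numpy as np

HBAR, OMEGA = 1.0, 2.0   # assumptions: hbar = 1, arbitrary precession frequency

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Assumed form of the Hamiltonian matrix (164) in the z-basis.
H_diag = 0.5 * HBAR * OMEGA * np.array([-1.0, 1.0])


def u(t):
    # H is diagonal, hence u(t, 0) = exp(-i H t / hbar) is diagonal as well.
    return np.diag(np.exp(-1j * H_diag * t / HBAR))


def heisenberg(A, t):
    # Definition (190): A_H(t) = u^dagger A_S u.
    return u(t).conj().T @ A @ u(t)


t = 0.37
c, s = np.cos(OMEGA * t), np.sin(OMEGA * t)

err_x = np.abs(heisenberg(sx, t) - (sx * c + sy * s)).max()
err_y = np.abs(heisenberg(sy, t) - (sy * c - sx * s)).max()
err_z = np.abs(heisenberg(sz, t) - sz).max()
print(err_x, err_y, err_z)
```

All three discrepancies should vanish to rounding accuracy; the common factor ħ/2 relating σ to S is omitted since it drops out of the comparison.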


41 We could of course use this result, together with Eq. (157), to obtain all the above results for this
system within the Schrödinger picture. In our simple case, the use of Eqs. (161) for this purpose was more
straightforward, but in some cases (e.g., for time-dependent Hamiltonians) an explicit calculation of the
time-evolution matrix may be the only practicable way to proceed.
In order to see that the last fact is by no means a coincidence, let us combine Eqs. (158) and
(190) to form an explicit differential equation of the Heisenberg operator evolution. For that, let us
differentiate Eq. (190) over time:

$$\frac{d\hat{A}_H}{dt} = \frac{\partial\hat{u}^\dagger}{\partial t}\hat{A}_S\hat{u} + \hat{u}^\dagger\frac{\partial\hat{A}_S}{\partial t}\hat{u} + \hat{u}^\dagger\hat{A}_S\frac{\partial\hat{u}}{\partial t}. \tag{4.197}$$


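Plugging in the derivatives of the time evolution operator from Eq. (158) and its Hermitian conjugate,
and multiplying both parts of the equation by $i\hbar$, we get

$$i\hbar\frac{d\hat{A}_H}{dt} = -\hat{u}^\dagger\hat{H}\hat{A}_S\hat{u} + i\hbar\,\hat{u}^\dagger\frac{\partial\hat{A}_S}{\partial t}\hat{u} + \hat{u}^\dagger\hat{A}_S\hat{H}\hat{u}. \tag{4.198a}$$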


If for the Schrodinger-picture Hamiltonian the condition similar to Eq. (179) is satisfied, then, according 
to Eqs. (177) or (182), the Hamiltonian commutes with the time evolution operator and its Hermitian 
conjugate, and may be swapped with any of them. 42 Hence, we may rewrite Eq. (198a) as 


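$$i\hbar\frac{d\hat{A}_H}{dt} = -\hat{H}\hat{u}^\dagger\hat{A}_S\hat{u} + i\hbar\,\hat{u}^\dagger\frac{\partial\hat{A}_S}{\partial t}\hat{u} + \hat{u}^\dagger\hat{A}_S\hat{u}\hat{H} = i\hbar\,\hat{u}^\dagger\frac{\partial\hat{A}_S}{\partial t}\hat{u} + \left[\hat{u}^\dagger\hat{A}_S\hat{u}, \hat{H}\right]. \tag{4.198b}$$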


Now using the definition (190) again, for both terms in the right-hand part, we may write 


$$i\hbar\frac{d\hat{A}_H}{dt} = i\hbar\left(\frac{\partial\hat{A}}{\partial t}\right)_H + \left[\hat{A}_H, \hat{H}\right]. \tag{4.199}$$


This is the so-called Heisenberg equation of motion. 43

Let us see how this equation looks for the same problem of spin-1/2 precession in a z-oriented,
time-independent magnetic field, described in the z-basis by the Hamiltonian matrix (164),
which does not depend on time. In this basis, Eq. (199) for the vector operator of spin reads 44


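$$i\hbar\begin{pmatrix}\dot{\mathbf{S}}_{11} & \dot{\mathbf{S}}_{12}\\ \dot{\mathbf{S}}_{21} & \dot{\mathbf{S}}_{22}\end{pmatrix} = \frac{\hbar\Omega}{2}\left[\begin{pmatrix}\mathbf{S}_{11} & \mathbf{S}_{12}\\ \mathbf{S}_{21} & \mathbf{S}_{22}\end{pmatrix}, \begin{pmatrix}-1 & 0\\ 0 & +1\end{pmatrix}\right] = \hbar\Omega\begin{pmatrix}0 & \mathbf{S}_{12}\\ -\mathbf{S}_{21} & 0\end{pmatrix}. \tag{4.200}$$

Once again, the equations for different matrix elements are decoupled, and their solution is elementary:

$$\mathbf{S}_{11}(t) = \mathbf{S}_{11}(0) = \text{const}, \quad \mathbf{S}_{22}(t) = \mathbf{S}_{22}(0) = \text{const}, \quad \mathbf{S}_{12}(t) = \mathbf{S}_{12}(0)e^{-i\Omega t}, \quad \mathbf{S}_{21}(t) = \mathbf{S}_{21}(0)e^{+i\Omega t}. \tag{4.201}$$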


42 Due to the same reason, $\hat{H}_H \equiv \hat{u}^\dagger\hat{H}_S\hat{u} = \hat{u}^\dagger\hat{u}\hat{H}_S = \hat{H}_S$; this is why the index of the Hamiltonian operator may
be dropped in Eqs. (198)-(199).

43 Reportedly, this equation was derived by P. A. M. Dirac, who was so generous that he himself gave the name 
of his colleague to this key result, because “Heisenberg was saying something like this”. 

44 Using commutation relations (155), this equation may be readily generalized to the case of an arbitrary magnetic
field $\mathcal{B}(t)$ and an arbitrary state basis; this exercise is highly recommended to the reader.
According to Eq. (190), the initial “values” of the Heisenberg-picture matrix elements are just the
Schrödinger-picture ones, so that using Eq. (117) we may rewrite this solution in either of two forms:
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(4.202) 


The simplicity of the last expression is spectacular. (Remember, it covers any initial conditions, 
and all 3 spatial components of spin!) On the other hand, for some purposes the former expression may 
be more convenient; in particular, its Cartesian components immediately give our earlier results (194)- 
(196). 
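The Heisenberg equation of motion (199) itself, and the decoupling of the matrix elements, may also be verified by brute force. The sketch below again assumes ħ = 1 and the Hamiltonian matrix (ħΩ/2)·diag(−1, +1) (my reading of matrix (164)); the initial matrix `S0` is an arbitrary Hermitian matrix chosen only for illustration, and the time derivative is replaced with a central finite difference.

```python
import numpy as np

HBAR, OMEGA = 1.0, 2.0                      # assumptions: hbar = 1, arbitrary Omega
H = 0.5 * HBAR * OMEGA * np.diag([-1.0, 1.0]).astype(complex)  # assumed matrix (164)

# Arbitrary Hermitian matrix playing the role of the initial "value" S(0).
S0 = np.array([[0.3, 0.1 + 0.4j],
               [0.1 - 0.4j, -0.3]])


def S_heis(t):
    # S_H(t) = u^dagger S(0) u with the diagonal u(t, 0) = exp(-i H t / hbar).
    u = np.diag(np.exp(-1j * np.diag(H).real * t / HBAR))
    return u.conj().T @ S0 @ u


t, eps = 0.9, 1e-6
dS_dt = (S_heis(t + eps) - S_heis(t - eps)) / (2 * eps)
commutator = S_heis(t) @ H - H @ S_heis(t)

residual = np.abs(1j * HBAR * dS_dt - commutator).max()  # i*hbar*dS/dt - [S, H]
ratio_12 = S_heis(t)[0, 1] / S0[0, 1]                    # phase factor of S_12
diag_drift = np.abs(np.diag(S_heis(t)) - np.diag(S0)).max()
print(residual, ratio_12, diag_drift)
```

The printed `ratio_12` is a pure phase factor, while the diagonal elements do not change at all, which is the numerical counterpart of the decoupled solution for the matrix elements.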

One of the advantages of the Heisenberg picture is that it provides a clearer link between
classical and quantum mechanics. Indeed, analytical classical mechanics may be used to derive the
following equation of time evolution of an arbitrary function $A(q_j, p_j, t)$ of the generalized coordinates and
momenta of the system, and time: 45

$$\frac{dA}{dt} = \frac{\partial A}{\partial t} - \{A, H\}, \tag{4.203}$$

where H is the classical Hamiltonian function of the system, and $\{\dots,\dots\}$ is the so-called Poisson bracket
defined, for two arbitrary functions $A(q_j, p_j, t)$ and $B(q_j, p_j, t)$, as

$$\{A, B\} \equiv \sum_j\left(\frac{\partial A}{\partial p_j}\frac{\partial B}{\partial q_j} - \frac{\partial A}{\partial q_j}\frac{\partial B}{\partial p_j}\right). \tag{4.204}$$


Comparing Eq. (203) with Eq. (199), we see that the correspondence between classical and quantum
mechanics (in the Heisenberg picture) is provided by the following symbolic relation 46

$$\{A, B\} \leftrightarrow \frac{i}{\hbar}\left[\hat{A}, \hat{B}\right]. \tag{4.205}$$


45 See, e.g., CM Eq. (10.17). Also, please excuse my use, for the Poisson bracket, of the same (traditional) symbol
$\{\dots,\dots\}$ as for the anticommutator. We will not run into the Poisson brackets again in the course, leaving very
little chance for confusion.

46 Since we have run into the commutator of Heisenberg-picture operators, let me emphasize again that the
“values” of such an operator at different moments of time often do not commute. Perhaps the simplest example is
the operator $\hat{x}$ of coordinate of a free 1D particle, with Hamiltonian $\hat{H} = \hat{p}^2/2m$. Indeed, in this case Eq. (199)
yields equations $i\hbar\dot{\hat{x}} = [\hat{x}, \hat{H}] = i\hbar\hat{p}/m$ and $i\hbar\dot{\hat{p}} = [\hat{p}, \hat{H}] = 0$, with simple solutions (similar to those for classical
motion of the corresponding observables): $\hat{p}(t) = \text{const} = \hat{p}(0)$, $\hat{x}(t) = \hat{x}(0) + \hat{p}(0)t/m$, so that
$[\hat{x}(0), \hat{x}(t)] = [\hat{x}(0), \hat{p}(0)]\,t/m \equiv [\hat{x}_S, \hat{p}_S]\,t/m = i\hbar t/m \neq 0$, if $t \neq 0$.
This relation may be used, in particular, for finding appropriate operators for the system's observables, if
their form is not immediately evident from the correspondence principle. We will develop this
argumentation further in the next chapter, where we revisit wave mechanics, and also in Chapter 9.

Finally, let us discuss one more alternative picture of quantum dynamics. It is also attributed to
P. A. M. Dirac, and is called either the “Dirac picture”, or (more frequently) the interaction picture. The
last name stems from the fact that this picture is very useful for the perturbative (approximate)
approaches to systems whose Hamiltonians may be partitioned into two parts,

$$\hat{H} = \hat{H}_0 + \hat{H}_{\rm int}, \tag{4.206}$$

where $\hat{H}_0$ is the sum of relatively simple Hamiltonians of non-interacting component sub-systems,
while the second term in Eq. (206) represents their weak interaction. (Note, however, that the relations
in the balance of this section are exact and not based on these assumptions.) In this case, it is natural to
consider, together with the genuine unitary operator $\hat{u}(t,t_0)$ of the time evolution of the system, which
obeys Eq. (158), a similarly defined unitary operator of evolution of the “unperturbed system” described
by Hamiltonian $\hat{H}_0$ alone:

$$i\hbar\dot{\hat{u}}_0 = \hat{H}_0\hat{u}_0, \tag{4.207}$$

and also the following interaction evolution operator,

$$\hat{u}_I \equiv \hat{u}_0^\dagger\hat{u}. \tag{4.208}$$

The sense of this definition becomes clearer if we insert the reciprocal relation,

$$\hat{u} \equiv \hat{u}_0\hat{u}_0^\dagger\hat{u} = \hat{u}_0\hat{u}_I, \tag{4.209}$$

and its Hermitian conjugate,

$$\hat{u}^\dagger = \left(\hat{u}_0\hat{u}_I\right)^\dagger = \hat{u}_I^\dagger\hat{u}_0^\dagger, \tag{4.210}$$

into the basic Eq. (189), which is valid in any picture:

$$\langle\alpha|\hat{A}|\beta\rangle = \langle\alpha(t_0)|\hat{u}^\dagger(t,t_0)\hat{A}_S\hat{u}(t,t_0)|\beta(t_0)\rangle = \langle\alpha(t_0)|\hat{u}_I^\dagger(t,t_0)\hat{u}_0^\dagger(t,t_0)\hat{A}_S\hat{u}_0(t,t_0)\hat{u}_I(t,t_0)|\beta(t_0)\rangle. \tag{4.211}$$

This relation shows that all calculations of the observable expectation values and transition rates
(i.e. all the results of quantum mechanics that may be experimentally verified) are expressed by the
following formula, with the standard bra-ket structure (187),

$$\langle\alpha|\hat{A}|\beta\rangle = \langle\alpha_I(t)|\hat{A}_I(t)|\beta_I(t)\rangle, \tag{4.212}$$


if we assume that both the state vectors and operators evolve in time, with the vectors evolving due to
the interaction operator $\hat{u}_I$,

$$\langle\alpha_I(t)| \equiv \langle\alpha(t_0)|\hat{u}_I^\dagger(t,t_0), \quad |\beta_I(t)\rangle \equiv \hat{u}_I(t,t_0)|\beta(t_0)\rangle, \tag{4.213}$$

while the operators' evolution is governed by the unperturbed operator $\hat{u}_0$:

$$\hat{A}_I(t) \equiv \hat{u}_0^\dagger(t,t_0)\hat{A}_S\hat{u}_0(t,t_0). \tag{4.214}$$
These relations describe the interaction picture of quantum dynamics. Let me defer an example
of its convenience until the perturbative analysis of open quantum systems in Sec. 7.6, and here end the
discussion with a proof that the interaction evolution operator satisfies the Schrödinger equation,

$$i\hbar\dot{\hat{u}}_I = \hat{H}_I\hat{u}_I, \tag{4.215}$$

in which $\hat{H}_I$ is the interaction Hamiltonian transformed in accordance with rule (214):

$$\hat{H}_I(t) \equiv \hat{u}_0^\dagger(t,t_0)\hat{H}_{\rm int}\hat{u}_0(t,t_0). \tag{4.216}$$


The proof is very straightforward: first using definition (208), and then Eqs. (158) and the Hermitian 
conjugate of Eq. (207), we may write 


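$$\begin{aligned}
i\hbar\dot{\hat{u}}_I &\equiv i\hbar\frac{d}{dt}\left(\hat{u}_0^\dagger\hat{u}\right) = i\hbar\left(\dot{\hat{u}}_0^\dagger\hat{u} + \hat{u}_0^\dagger\dot{\hat{u}}\right) = -\hat{H}_0\hat{u}_0^\dagger\hat{u} + \hat{u}_0^\dagger\hat{H}\hat{u} = -\hat{H}_0\hat{u}_0^\dagger\hat{u} + \hat{u}_0^\dagger\left(\hat{H}_0 + \hat{H}_{\rm int}\right)\hat{u}\\
&\equiv \left(-\hat{H}_0\hat{u}_0^\dagger + \hat{u}_0^\dagger\hat{H}_0\right)\hat{u} + \hat{u}_0^\dagger\hat{H}_{\rm int}\hat{u}.
\end{aligned} \tag{4.217}$$

Since $\hat{u}_0$ may be presented via an integral of $\hat{H}_0$ (similar to Eq. (181) relating $\hat{u}$ and $\hat{H}$), these operators
commute, so that the parentheses in the last form of Eq. (217) vanish. Now plugging $\hat{u}$ from Eq. (209),
we get the equation,

$$i\hbar\dot{\hat{u}}_I = \hat{u}_0^\dagger\hat{H}_{\rm int}\hat{u}_0\hat{u}_I \equiv \left(\hat{u}_0^\dagger\hat{H}_{\rm int}\hat{u}_0\right)\hat{u}_I, \tag{4.218}$$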


that is equivalent to the combination of Eqs. (215) and (216). 

Equation (215) shows that if the energy scale of the interaction $\hat{H}_{\rm int}$ is much smaller than that of the
background Hamiltonian $\hat{H}_0$, operators $\hat{u}_I$ and $\hat{u}_I^\dagger$, and hence the state vectors (213), evolve relatively slowly.
Such an exclusion of fast background oscillations is especially convenient for the perturbative
approaches to complex interacting systems, in particular to the open quantum systems that weakly
interact with their environment - see Sec. 7.6.
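Both claims of this discussion (the equation of motion (215) with the transformed Hamiltonian (216), and the slowness of the interaction-picture evolution) may be illustrated with a two-level toy model. This is a sketch under assumptions of my own: ħ = 1, a "fast" unperturbed Hamiltonian 5σ_z, and a weak, time-independent interaction 0.05σ_x; the names `propagator`, `u_I` and `H_I` are hypothetical.

```python
import numpy as np

HBAR = 1.0  # assumption: hbar = 1

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

H0 = 5.0 * sz          # hypothetical "fast" unperturbed Hamiltonian
H_int = 0.05 * sx      # hypothetical weak interaction
H = H0 + H_int


def propagator(M, t):
    """Return exp(-i M t / hbar) for a time-independent Hermitian matrix M."""
    w, v = np.linalg.eigh(M)
    return (v * np.exp(-1j * w * t / HBAR)) @ v.conj().T


def u_I(t):
    # Definition (208): u_I = u_0^dagger u.
    return propagator(H0, t).conj().T @ propagator(H, t)


def H_I(t):
    # Rule (216): H_I = u_0^dagger H_int u_0.
    u0 = propagator(H0, t)
    return u0.conj().T @ H_int @ u0


t, eps = 0.8, 1e-6
du_I = (u_I(t + eps) - u_I(t - eps)) / (2 * eps)        # finite-difference derivative
residual = np.abs(1j * HBAR * du_I - H_I(t) @ u_I(t)).max()

drift_full = np.abs(propagator(H, t) - np.eye(2)).max()  # change of u itself
drift_int = np.abs(u_I(t) - np.eye(2)).max()             # change of u_I
print(residual, drift_full, drift_int)
```

For these parameters the full evolution operator has changed by an amount of the order of unity, while u_I is still very close to the identity operator.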


4.7. Exercise problems 


4.1. Let $\alpha$ and $\beta$ be two possible quantum states of the same system, and $\hat{A}$ be a linear operator.
Which of the following expressions are legitimate (i.e. have a well-defined meaning) within the bra-ket
formalism?


1. (a 


2. (a p 


3. \a)(p\ 


4. A 


5. A 


6. (a A 


7. a A 


8 . a 


9. A 


10 . (a 


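4.2. Prove that if $\hat{A}$ and $\hat{B}$ are linear operators, then:
(i) $(\hat{A}^\dagger)^\dagger = \hat{A}$; (ii) $(i\hat{A})^\dagger = -i\hat{A}^\dagger$; (iii) $(\hat{A}\hat{B})^\dagger = \hat{B}^\dagger\hat{A}^\dagger$;
(iv) operators $\hat{A}\hat{A}^\dagger$ and $\hat{A}^\dagger\hat{A}$ are Hermitian.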
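4.3. Prove that for any linear operators $\hat{A}, \hat{B}, \hat{C}, \hat{D}$,

$$\left[\hat{A}\hat{B}, \hat{C}\hat{D}\right] = \hat{A}\{\hat{B}, \hat{C}\}\hat{D} - \hat{A}\hat{C}\{\hat{B}, \hat{D}\} + \{\hat{A}, \hat{C}\}\hat{D}\hat{B} - \hat{C}\{\hat{A}, \hat{D}\}\hat{B}.$$

4.4. Calculate all possible binary products $\sigma_j\sigma_{j'}$ (for j, j' = x, y, z) of the Pauli matrices (105),

$$\sigma_x = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \quad \sigma_y = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}, \quad \sigma_z = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix},$$

and their commutators and anticommutators (defined similarly to those of the corresponding operators).
Present the results using the Kronecker delta and Levi-Civita permutation symbols. 47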

4.5. Calculate the following expressions,

(i) $(\mathbf{c}\cdot\boldsymbol{\sigma})^n$, and then

(ii) $(b\mathrm{I} + \mathbf{c}\cdot\boldsymbol{\sigma})^n$,

for the scalar product $\mathbf{c}\cdot\boldsymbol{\sigma}$ of the Pauli matrix vector $\boldsymbol{\sigma} \equiv \mathbf{n}_x\sigma_x + \mathbf{n}_y\sigma_y + \mathbf{n}_z\sigma_z$ by an arbitrary c-number
vector $\mathbf{c}$, where $n \geq 0$ is an integer, and b is an arbitrary scalar c-number.

Hint: For task (ii), you may like to use the binomial theorem, 48 and then transform the result in a
way enabling you to use the same theorem backwards.

4.6 . * Use the results of the previous problem to derive Eqs. (2.165)-(2.166) for the transparency 
T of a system of N similar, equidistant, delta-functional tunnel barriers. 

4.7. Use the result of Problem 5 to spell out the following matrix: $\exp\{i\theta\,\mathbf{n}\cdot\boldsymbol{\sigma}\}$, where
$\boldsymbol{\sigma}$ is the vector of Pauli matrices, $\mathbf{n}$ is a c-number vector of unit length, and $\theta$ is a c-number scalar.

4.8 . Use the result of Problem 5(ii) to calculate exp {A}, where A is an arbitrary 2x2 matrix. 

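4.9. Express elements of matrix B = exp{A} explicitly via those of the 2x2 matrix A. Spell out
your result for the following matrices:

$$\mathrm{A} = \begin{pmatrix}a & a\\ a & a\end{pmatrix}, \quad \mathrm{A}' = \begin{pmatrix}i\varphi & i\varphi\\ i\varphi & i\varphi\end{pmatrix},$$

with real $a$ and $\varphi$.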

4.10. Prove that for arbitrary square matrices A and B,

$$\mathrm{Tr}(\mathrm{AB}) = \mathrm{Tr}(\mathrm{BA}).$$

Is each diagonal element $(\mathrm{AB})_{jj}$ necessarily equal to $(\mathrm{BA})_{jj}$?


47 See, e.g., MA Eqs. (13.1) and (13.2). 

48 See, e.g. MA Eq. (2.9). 




4.11. Prove that the matrix trace of an arbitrary operator does not change at an arbitrary unitary
transformation.


4.12. Prove that for any two full and orthonormal bases $u_j$, $v_j$ of the same Hilbert space,



4.13. Is the 1D scattering matrix S, defined by Eq. (133), unitary? What about the 1D transfer
matrix T defined by Eq. (134)?

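4.14. Calculate the trace of the following matrix:

$$\exp\{i\mathbf{a}\cdot\boldsymbol{\sigma}\}\exp\{i\mathbf{b}\cdot\boldsymbol{\sigma}\},$$

where $\boldsymbol{\sigma}$ is the Pauli matrix vector, while $\mathbf{a}$ and $\mathbf{b}$ are usual (c-number) geometric vectors.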


4.15. Let $A_j$ be eigenvalues of some operator $\hat{A}$. Express the following two sums,

j j 

via the matrix elements $A_{jj'}$ of this operator in an arbitrary basis.

4.16. Calculate $\langle\sigma_z\rangle$ of a two-level system in a quantum state with the following ket-vector:

$$|\alpha\rangle = \text{const}\times\left(|\uparrow\rangle + |\downarrow\rangle + |\rightarrow\rangle + |\leftarrow\rangle\right),$$

where $(\uparrow, \downarrow)$ and $(\rightarrow, \leftarrow)$ are eigenstates of the Pauli matrices $\sigma_z$ and $\sigma_x$, respectively.

Hint: Double-check whether the solution you are giving is general.


4.17. An electron is fully polarized in the positive z-direction. Calculate the probabilities of the
alternative outcomes of a perfect Stern-Gerlach experiment with the magnetic field $\mathcal{B}$ oriented in the
direction of some axis $\mathbf{n}$, performed on this electron.

4.18. A perfect Stern-Gerlach instrument makes a single-shot measurement of the following
combination, $(S_x + S_z)/\sqrt{2}$, of two spin components of a z-polarized electron; after that, component $S_z$ of
the same particle is measured. What are the possible outcomes of these measurements and their
probabilities?


4.19. In a certain basis, the Hamiltonian of a spin-1/2 (two-level) system is described by the matrix

$$\mathrm{H} = \begin{pmatrix}E_1 & 0\\ 0 & E_2\end{pmatrix}, \quad \text{with } E_1 \neq E_2,$$

and the operator of some observable A, by the matrix

$$\mathrm{A} = \begin{pmatrix}1 & 1\\ 1 & 1\end{pmatrix}.$$

For the system's state with the energy equal exactly to $E_1$, find the possible results of measurements of
observable A and the probabilities of the corresponding measurement outcomes.

4.20.* States $u_{1,2,3}$ form an orthonormal basis of a system with Hamiltonian

$$\hat{H} = -\delta\left(|u_1\rangle\langle u_2| + |u_2\rangle\langle u_3| + |u_3\rangle\langle u_1|\right) + \text{h.c.},$$

where $\delta$ is a real constant, and h.c. means the Hermitian conjugate of the previous expression. Calculate
its stationary states and energy levels. Can you relate this system with any other(s) discussed earlier in
the course?

4.21. Suggest a Hamiltonian describing a particle's dynamics in an infinite 1D set of similar
quantum wells in the tight-binding approximation, in the bra-ket formalism, and verify that it yields the
correct dispersion relation (2.206).

4.22 . Calculate eigenvectors and eigenvalues of the following matrices: 


"0 




ro 
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0 
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0, 


4.23. Find eigenvalues of the following matrix:

$$\mathrm{A} = \mathbf{a}\cdot\boldsymbol{\sigma} \equiv a_x\sigma_x + a_y\sigma_y + a_z\sigma_z,$$

where $a_{x,y,z}$ are real c-numbers (scalars), and $\sigma_{x,y,z}$ are the Pauli matrices. Sketch the dependence of the
eigenvalues on parameter $a_z$, with $a_x$ and $a_y$ fixed. Compare the result with Fig. 29.

4.24 . Derive a differential equation for the time evolution of the expectation value of an 
observable, using both the Schrodinger picture and the Heisenberg picture of quantum mechanics. 

4.25. At t = 0, a spin-1/2 particle, whose interaction with an external field is described by
Hamiltonian

$$\hat{H} = \mathbf{a}\cdot\hat{\boldsymbol{\sigma}} \equiv a_x\hat{\sigma}_x + a_y\hat{\sigma}_y + a_z\hat{\sigma}_z,$$

(where $a_{x,y,z}$ are real and constant c-numbers, and $\hat{\sigma}_{x,y,z}$ are the operators that, in the z-basis, are
represented by the Pauli matrices $\sigma_{x,y,z}$), was in state $\uparrow$, one of two eigenstates of operator $\hat{\sigma}_z$. Use the
Schrödinger picture equations to calculate the time evolution of:

(i) the ket-vector $|\alpha\rangle$ of the system (in any stationary basis you like),

(ii) the probabilities to find the system in states $\uparrow$ and $\downarrow$, and

(iii) the expectation values of all 3 spatial components ($\hat{S}_x$, etc.) of the spin vector operator
$\hat{\mathbf{S}} = (\hbar/2)\hat{\boldsymbol{\sigma}}$.

Analyze and interpret the results for the particular case $a_y = a_z = 0$.
4.26. For the same system as in the previous problem, use the Heisenberg picture equations to
calculate the time evolution of:

(i) all three spatial components ($\hat{S}_x$, etc.) of the spin operator $\hat{\mathbf{S}}_H(t)$,

(ii) the expectation values of the spin components.

Compare the latter results with those of the previous problem.

4.27. For the same system as in the two last problems, calculate the matrix elements of operator $\hat{\sigma}_z$
in the basis of eigenstates $a_1$, $a_2$.

Hint : In contrast to the cited problems, the answer evidently does not depend on the initial 
conditions. 


4.28. In the Schrödinger picture of quantum mechanics, three operators satisfy the following
commutation relation:

$$\left[\hat{A}, \hat{B}\right] = \hat{C}.$$

What is their relation in the Heisenberg picture (at the same time instant)?


4.29. A spin-1/2 particle is placed into a magnetic field $\mathcal{B}(t)$, which is an arbitrary function of
time. Derive the differential equations describing the time evolution of:

(i) the vector operator $\hat{\mathbf{S}}$ of the particle's spin (in the Heisenberg picture), and

(ii) the expectation value $\langle\mathbf{S}\rangle$ of the spin's vector.

Contemplate the relative merits of the latter equation for the description of a single spin and of a large
collection of similar, non-interacting spins.

4.30.* Prove the Bloch theorem given by either Eq. (3.107) or Eq. (3.108).

Hint: Consider the translation operator $\hat{T}_{\mathbf{R}}$, defined by the following result of its action on an
arbitrary function $f(\mathbf{r})$:

$$\hat{T}_{\mathbf{R}}f(\mathbf{r}) = f(\mathbf{r} + \mathbf{R}),$$

where $\mathbf{R}$ is an arbitrary vector of the Bravais lattice (3.106). In particular, analyze the commutation
properties of the operator, and apply them to an eigenfunction $\psi(\mathbf{r})$ of the stationary Schrödinger
equation for a particle in a 3D periodic potential described by Eq. (3.105).
Chapter 5. Some Exactly Solvable Problems

This chapter describes several of the simplest but important applications of the bra-ket formalism, notably including a
few wave-mechanics problems we have already started to discuss in Chapters 2 and 3.


5.1. Two-level systems 

In the course of the discussion of the bra-ket formalism in the last chapter, we have already
considered several examples of how it works for the electron's spin. We have seen, in particular, that in
a magnetic field the electron has eigenenergies (4.167), i.e. two energy levels. As will be shown later in
the course, such a two-energy-level picture is valid not only for electrons and other spin-1/2 elementary
particles (such as muons and neutrinos), but also may give a good approximation for other important
quantum systems. For example, as was already mentioned in Chapter 2, two energy levels are sufficient
for an approximate description of the dynamics of two weakly coupled quantum wells (Sec. 2.6), and of
level anticrossing in the weak-potential approximation of the band theory (Sec. 2.7). Such two-level
systems (alternatively called “spin-1/2-like” systems) are nowadays the focus of additional attention in
view of the prospects of their possible use for information processing and encryption. (In the last
context, to be discussed in Sec. 8.5, a two-level system is usually called a qubit.)

This is why before proceeding to other problems, let us summarize in brief what we have already 
learned about properties and dynamics of two-level systems, in a more convenient language. According 
to the general Eq. (4.6), a ket- (or bra-) vector of an arbitrary pure (coherent) state a of such a system 
may be presented, at any instant, as a linear combination of two basis vectors, for example 

$$|\alpha\rangle = \alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle, \tag{5.1}$$


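and hence is completely described by two complex coefficients (c-numbers), say, $\alpha_\uparrow$ and $\alpha_\downarrow$. These two
numbers are not completely arbitrary; they are restricted by the normalization condition. If the basis
vectors are normalized, then for the total probability of finding the system in one of the basis states to be 100%, we need

$$W_\Sigma = \langle\alpha|\alpha\rangle = \left(\langle\uparrow|\alpha_\uparrow^* + \langle\downarrow|\alpha_\downarrow^*\right)\left(\alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle\right) = \alpha_\uparrow^*\alpha_\uparrow + \alpha_\downarrow^*\alpha_\downarrow = |\alpha_\uparrow|^2 + |\alpha_\downarrow|^2 = 1. \tag{5.2}$$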


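This requirement is automatically satisfied if we take the moduli of $\alpha_\uparrow$ and $\alpha_\downarrow$ equal to the cosine and
sine of the same (real) angle. Thus we can write, for example,

$$\alpha_\uparrow = \cos\frac{\theta}{2}\,e^{i\gamma}, \quad \alpha_\downarrow = \sin\frac{\theta}{2}\,e^{i(\gamma+\varphi)}. \tag{5.3}$$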


Moreover, according to the general Eq. (4.125), if we deal with just one system, 1 the common phase
factor $\exp\{i\gamma\}$ is unimportant for calculation of any expectation values, and we can take $\gamma = 0$, so that
Eq. (3) is reduced to


1 To recall why this condition is crucial, please revisit the beginning of Sec. 2.3. Note also that, in particular, the 
mutual phase shifts between different qubits are very important for quantum information processing (see Chapter 
7 below), so that most discussions of these applications have to start from Eq. (3) rather than Eq. (4). 
$$\alpha_\uparrow = \cos\frac{\theta}{2}, \quad \alpha_\downarrow = \sin\frac{\theta}{2}\,e^{i\varphi}. \tag{5.4}$$

The reason why the argument of the sine and cosine functions is usually taken in the form $\theta/2$
becomes clear from Fig. 1a: Eq. (4) conveniently maps each state $\alpha$ on a certain representation point of
a unit-radius Bloch sphere, 2 with polar angle $\theta$ and azimuthal angle $\varphi$. In particular, state $\uparrow$ (with $\alpha_\uparrow = 1$
and $\alpha_\downarrow = 0$) corresponds to the North Pole of the sphere ($\theta = 0$), while state $\downarrow$ (with $\alpha_\uparrow = 0$ and $\alpha_\downarrow = 1$),
to its South Pole ($\theta = \pi$). 3 Similarly, states $\rightarrow$ and $\leftarrow$, described by Eqs. (4.122), i.e. having $\alpha_\uparrow = 1/\sqrt{2}$
and $\alpha_\downarrow = \pm 1/\sqrt{2}$, correspond to points with $\theta = \pi/2$ and, respectively, $\varphi = 0$ and $\varphi = \pi$. Two more
special points (denoted in Fig. 1a as $\odot$ and $\otimes$) are also located on the sphere's equator (at $\theta = \pi/2$ and $\varphi = \pm\pi/2$);
it is easy to check that they correspond to the eigenstates of matrix $\sigma_y$ (in the same z-basis).


In order to understand why such a mutually perpendicular location of these three special point
pairs on the Bloch sphere is not accidental, let us plug Eqs. (4) into Eqs. (4.131)-(4.133) for the
expectation values of spin components. The result is

$$\langle S_x\rangle = \frac{\hbar}{2}\sin\theta\cos\varphi, \quad \langle S_y\rangle = \frac{\hbar}{2}\sin\theta\sin\varphi, \quad \langle S_z\rangle = \frac{\hbar}{2}\cos\theta, \tag{5.5}$$


showing that the radius-vector of the representation point on the sphere is (after multiplication by $\hbar/2$)
just the expectation value of the spin vector $\mathbf{S}$.
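The mapping of a state onto the Bloch sphere may be checked directly from the Pauli matrices. The following minimal sketch works in units of ħ/2 (i.e. it computes the expectation values of σ rather than of S); the function names `bloch_state` and `pauli_expectations`, and the particular angles used, are illustrative assumptions.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)


def bloch_state(theta, phi):
    """Coefficients (alpha_up, alpha_down) of Eq. (4), with the common phase dropped."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2) * np.exp(1j * phi)])


def pauli_expectations(state):
    """Expectation values of sigma_x, sigma_y, sigma_z in a normalized state."""
    return np.array([np.vdot(state, s @ state).real for s in (sx, sy, sz)])


theta, phi = 1.1, 2.3                      # arbitrary polar and azimuthal angles
vector = pauli_expectations(bloch_state(theta, phi))
expected = np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])
print(vector, expected)

# Special points: the North Pole and the two points on the x-axis.
north = pauli_expectations(bloch_state(0.0, 0.0))
right = pauli_expectations(bloch_state(np.pi / 2, 0.0))
left = pauli_expectations(bloch_state(np.pi / 2, np.pi))
```

The two printed vectors coincide, and the special states land on the z- and x-axes of the sphere, as described above.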



Fig. 5.1. Bloch sphere: (a) notation, and presentation of spin precession in magnetic fields directed 
along: (b) axis z, and (c) axis x. 

Now let us see how the representation point moves in various cases. First of all, according
to Eqs. (4.157)-(4.158), in the absence of an external field (when the Hamiltonian operator is equal to
zero and hence the time-evolution operator is constant) the point does not move at all. Now, if we apply
to an electron a magnetic field directed along axis z, then, according to Eqs. (4.202), the Heisenberg
operator of $S_z$ (and hence the expectation value $\langle S_z\rangle$) remains constant, while angle $\varphi$ in Eq. (5) evolves

2 Named after the same F. Bloch who has pioneered the energy band theory that was discussed in Chapters 2-3. 

3 In the quantum information literature, ket-vectors $|\uparrow\rangle$ and $|\downarrow\rangle$ of these two states of a qubit are usually denoted as
|1⟩ (“quantum one”) and |0⟩ (“quantum zero”).
in time as $\Omega t$ + const. This means that the torque-induced precession of the spin in a constant field
$\mathcal{B} = \mathbf{n}_z\mathcal{B}$ is described by a circular rotation of the representation point about axis z (in Fig. 1b, in the
horizontal plane) with the classical precession frequency $\Omega$. This is essentially the classical picture of
rotation of the angular momentum vector about the precession axis z, with both its length and the
z-component conserved. 4

It is straightforward to repeat all calculations of Sec. 4.6 for a field of a different orientation and
prove the (virtually evident) result that the representation point performs a similar rotation about the
field direction. (Fig. 1c shows such rotation for an x-directed field.) Finally, note that it is sufficient to
turn off the field to stop the precession instantly. (Since Eq. (4.158) is a first-order differential
equation, the representation point has no effective inertia. 5 ) Hence, by changing the direction and magnitude
of the external field, it is possible to move the spin's representation point to any position on the Bloch
sphere. (In Chapter 6 we will examine another method of manipulating the point position, which is based
on an external rf field and is more convenient for some two-level systems.)

In the context of quantum information, this means that in the absence of uncontrollable
interaction with the environment, it is possible to prepare a qubit in any pure quantum state, and then keep it
unchanged. From here it is clear that a qubit is very different from a classical bistable system
used to store single bits of information, such as the voltage state of a usual SRAM cell (a positive-feedback
loop of two transistor-based inverters). As Eq. (4) shows, a qubit's state is determined by two
independent, continuous parameters $\theta$ and $\varphi$, so it may store much more information than one bit. (The
difference is even more spectacular in qubit systems, to be discussed in Sec. 8.5.) However, classical
bistable systems, due to their nonlinearity, are stable with respect to small perturbations, so that their
operation is rather robust with respect to unintentional interaction with their environment. In contrast,
a qubit's state may be readily disturbed (i.e. its representation point on the Bloch sphere shifted) by even
minor perturbations, and does not have an internal state stabilization mechanism. 6 For this reason,
qubit-based systems are rather vulnerable to environment-induced drifts, including dephasing and
relaxation effects, which will be discussed in Chapter 7.


5.2. Revisiting wave mechanics

In order to use the bra-ket formalism for the description of the “orbital” motion of a particle as a
whole, we have to either rewrite or even modify some of its formulas for the case of observables with
a continuous spectrum of eigenvalues. (Examples we already know well are the momentum and kinetic
energy of a free particle.) In that case, all the above expressions for states, their bra- and ket-vectors, and
eigenvalues, should be stripped of discrete indices, like index j in the key equation (4.68) that determines
eigenstates and eigenvalues of observable A. For that, Eq. (4.68) may be rewritten in the form


4 Still, it is crucial to appreciate the difference between the expectation values (5), i.e. c-numbers, and the actual 
observables $S_x$, $S_y$, and $S_z$, which are described in quantum mechanics by operators. For example, according to Eq. 
(4.156), for any position on the Bloch sphere, it is impossible to have exact values of Cartesian components, as it 
is in the classical picture. 

5 The same is true for the angular momentum L at the classical torque-induced precession - see, e.g., CM Sec. 6.5 
and in particular Eq. (6.71). 

6 In this aspect as well, the information processing systems based on qubits are closer to classical analog 
computers rather than classical digital ones. 

$$\hat{A}\,|a_A\rangle = A\,|a_A\rangle. \tag{5.6}$$


More essentially, all sums over such continuous eigenstate sets should be replaced by integrals. 
For example, for a full and orthonormal set of eigenstates (6), the closure relation (4.44) should be 
replaced with 

$$\int dA\,|a_A\rangle\langle a_A| = \hat{I}, \tag{5.7}$$


where the integral should be taken over the whole interval of possible values of observable $A$. Applying 
this relation to the ket-vector of an arbitrary state $\alpha$ (generally, not an eigenstate of operator $\hat{A}$), we get 

$$|\alpha\rangle = \hat{I}\,|\alpha\rangle = \int dA\,|a_A\rangle\langle a_A|\alpha\rangle = \int dA\,\langle a_A|\alpha\rangle\,|a_A\rangle. \tag{5.8}$$


This integral replaces sum (4.37) for the representation of an arbitrary ket-vector as an expansion over 
eigenstates of an operator. For the particular case when $|\alpha\rangle = |a_{A'}\rangle$, this relation requires 7 

$$\langle a_A|a_{A'}\rangle = \delta(A - A'); \tag{5.9}$$


this formula replaces the orthonormality condition (4.38). 


According to Eq. (8), in the continuous case the bra-ket $\langle a_A|\alpha\rangle$ still plays the role of the 
coefficient whose modulus squared determines state $a_A$'s probability - see the last form of Eq. (4.120). 
However, in the continuous spectrum case the probability to find the system exactly in a particular state 
is infinitesimal. Instead we should speak about the probability density $w(A) \propto |\langle a_A|\alpha\rangle|^2$ to find the 
observable within a small interval $dA$ about a certain value $A$. The coefficient in that relation may be 
found by making a similar change from summation to integration (without any additional coefficients) 
in the normalization condition (4.121): 

$$\int dA\,\langle\alpha|a_A\rangle\langle a_A|\alpha\rangle = 1. \tag{5.10}$$


Since the total probability of the system to be in some state should also equal $\int w(A)\,dA$, this means that 

$$w(A) = \langle\alpha|a_A\rangle\langle a_A|\alpha\rangle = |\langle a_A|\alpha\rangle|^2. \tag{5.11}$$

Now let us see how we can calculate expectation values of continuous observables, i.e. their 
ensemble averages. If we speak about the same observable $A$ whose eigenstates are used as the basis (or 
any compatible observable), everything is simple. Inserting Eq. (11) into the general statistical relation 

$$\langle A\rangle = \int w(A)\,A\,dA, \tag{5.12}$$

which is just the evident continuous version of Eq. (1.37), we get 

$$\langle A\rangle = \int \langle\alpha|a_A\rangle\,A\,\langle a_A|\alpha\rangle\,dA. \tag{5.13}$$

Presenting this expression as a double integral, 

$$\langle A\rangle = \int dA\int dA'\,\langle\alpha|a_A\rangle\,A\,\delta(A - A')\,\langle a_{A'}|\alpha\rangle, \tag{5.14}$$


7 Notice that in contrast to the discrete spectrum case, the dimensionality of the bra- and ket-vectors so 
normalized is different from 1. 


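and using the continuous-spectrum version of Eq. (4.100), 

$$\langle a_A|\hat{A}|a_{A'}\rangle = A\,\delta(A - A'), \tag{5.15}$$

we may write 

$$\langle A\rangle = \int dA\int dA'\,\langle\alpha|a_A\rangle\langle a_A|\hat{A}|a_{A'}\rangle\langle a_{A'}|\alpha\rangle = \langle\alpha|\hat{A}|\alpha\rangle, \tag{5.16}$$

so that Eq. (4.125) remains valid in the continuous-spectrum case without any changes. 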


The situation is a bit more complicated for the expectation values of operators that do not 
commute with the base-creating operator, because the matrix of such an operator in that basis may not be 
diagonal. We will consider (and overcome :-) this technical difficulty very soon, but otherwise we are 
ready for the discussion of wave mechanics. (For notation simplicity I will discuss its 1D version; 
the generalization to the 2D and 3D cases is straightforward.) 

Let us consider what is called the coordinate representation, postulating the (intuitively almost 
evident) existence of a quantum state basis, whose ket-vectors will be called $|x\rangle$, corresponding to a 
certain, exactly defined value $x$ of the particle's coordinate. Writing the following evident identity: 

$$x\,|x\rangle = x\,|x\rangle, \tag{5.17}$$


and comparing this relation with Eq. (6), we see that they do not contradict each other if we assume that 
$x$ in the left-hand part of this equation is considered as the coordinate operator $\hat{x}$ whose action on a 
ket- (or bra-) vector is just its multiplication by c-number $x$. (This looks like a proof, but is actually a 
separate, independent postulate, no matter how plausible.) 

In this coordinate representation, the inner product $\langle a_A|\alpha(t)\rangle$ becomes $\langle x|\alpha(t)\rangle$, and Eq. (11) takes 
the form 

$$w(x,t) = \langle\alpha(t)|x\rangle\langle x|\alpha(t)\rangle = \langle x|\alpha(t)\rangle^{*}\,\langle x|\alpha(t)\rangle. \tag{5.18}$$

Comparing this formula with the basic postulate (1.22) of wave mechanics, we see that they coincide if 
Schrodinger's wavefunction of the time-evolving state $\alpha(t)$ is identified with that bra-ket: 8 

$$\Psi_\alpha(x,t) = \langle x|\alpha(t)\rangle. \tag{5.19}$$


This key formula provides the connection between the bra-ket formalism and wave mechanics, 
and should not be too surprising for the (thoughtful :-) reader. Indeed, Eq. (4.45) shows that any inner 
product of vectors describing two states is a measure of their coincidence - just as the scalar product of 
two geometric vectors. (The orthonormality condition (4.38) is a particular manifestation of this fact.) In 
this language, value (19) of wavefunction $\Psi_\alpha$ at point $x$ and moment $t$ characterizes “how much of a 
particular coordinate $x$” the state $\alpha$ contains at that particular instant. (Of course this informal 
language is too crude to describe the fact that $\Psi_\alpha(x,t)$ is a complex function, which has not only a 
modulus, but also a phase.) 


8 I do not quite like expressions like $\langle x|\Psi\rangle$ used in some papers and even textbooks. Of course, one is free to 
replace $\alpha$ with any other letter ($\Psi$ including) to denote a quantum state, but then it is better not to use the same 
letter to denote the wavefunction, i.e. an inner product of two state vectors, to avoid confusion. 

Let us rewrite the most important formulas of the bra-ket formalism (so far, in the Schrodinger 
picture) in the wave mechanics notation. In particular, let us use Eq. (19) to calculate the (partial) time 
derivative of the wavefunction, multiplied by the usual coefficient $i\hbar$: 

$$i\hbar\frac{\partial\Psi_\alpha}{\partial t} = i\hbar\frac{\partial}{\partial t}\langle x|\alpha(t)\rangle. \tag{5.20}$$

Since the coordinate operator $\hat{x}$ does not depend on time explicitly, its eigenstates $x$ are stationary, and 
we can swap the time derivative and the time-independent bra-vector $\langle x|$. Making use of the 
Schrodinger-picture equations (4.157) and (4.158), and then inserting the identity operator in the 
continuous form (7) of the closure relation, written for the coordinate eigenstates, 

$$\int dx'\,|x'\rangle\langle x'| = \hat{I}, \tag{5.21}$$

we may continue to develop the right-hand part of Eq. (20) as 

$$\begin{aligned}
\langle x|\,i\hbar\frac{\partial}{\partial t}\,|\alpha(t)\rangle
&= \langle x|\,i\hbar\frac{\partial}{\partial t}\hat{u}(t,t_0)\,|\alpha(t_0)\rangle
= \langle x|\hat{H}\hat{u}(t,t_0)|\alpha(t_0)\rangle
= \langle x|\hat{H}|\alpha(t)\rangle \\
&= \int dx'\,\langle x|\hat{H}|x'\rangle\langle x'|\alpha(t)\rangle
= \int dx'\,\langle x|\hat{H}|x'\rangle\,\Psi_\alpha(x',t).
\end{aligned} \tag{5.22}$$


For a general Hamiltonian operator, we have to stop here, because if it does not commute with 
the coordinate operator, its matrix in the x-basis is not diagonal, and integral (22) cannot be worked out 
explicitly. However, there exists a broad set of space-local operators 9 whose arguments include just one 
value of the spatial coordinate, for which we can move the bra-vector $\langle x|$ to the right: 10 

$$\langle x|\hat{A}|x'\rangle\,\Psi(x',t) = \hat{A}\,\Psi(x',t)\,\langle x|x'\rangle = \hat{A}\,\Psi(x,t)\,\delta(x - x'), \tag{5.23}$$

where operator $\hat{A}$ in the last two forms should be understood as its coordinate representation that is 
defined by Eq. (23) - if it is valid for a particular operator. For example, consider the 1D version of 
operator (1.40), 

$$\hat{H} = \frac{\hat{p}_x^2}{2m} + U(x,t), \tag{5.24}$$

which was the basis of all our discussions in Chapter 2. Its potential-energy part commutes with 
operator $\hat{x}$, so its matrix in the x-basis is diagonal, meaning that this part of Hamiltonian (24) is clearly 
local, with its coordinate representation given merely by the c-number function $U(x,t)$. The situation 
with the kinetic energy, which is a function of the momentum operator $\hat{p}_x$, not commuting with $\hat{x}$, is less 
evident. Let me show that this operator is also local, and in the same shot derive (the 1D version of) Eq. 
(1.26), if we postulate the commutation relation (2.14): 

$$\hat{x}\hat{p}_x - \hat{p}_x\hat{x} = i\hbar\hat{I}. \tag{5.25}$$


9 Of all the operators we will encounter in this course, only the statistical operator $\hat{w}$ is substantially non-local - 
see Sec. 7.2. 

10 In the second equality, I have used Eq. (9) for variable $x$. 

For that, let us consider the following matrix element, $\langle x|\hat{x}\hat{p}_x - \hat{p}_x\hat{x}|x'\rangle$. On one hand, we may 
use Eq. (25) to write 

$$\langle x|\hat{x}\hat{p}_x - \hat{p}_x\hat{x}|x'\rangle = \langle x|i\hbar\hat{I}|x'\rangle = i\hbar\,\langle x|x'\rangle = i\hbar\,\delta(x - x'). \tag{5.26}$$

On the other hand, since $\hat{x}|x'\rangle = x'|x'\rangle$ and $\langle x|\hat{x} = \langle x|x$, we can write 

$$\langle x|\hat{x}\hat{p}_x - \hat{p}_x\hat{x}|x'\rangle = \langle x|x\hat{p}_x - \hat{p}_x x'|x'\rangle = (x - x')\,\langle x|\hat{p}_x|x'\rangle. \tag{5.27}$$
Comparing Eqs. (26) and (27), we may write 

$$\langle x|\hat{p}_x|x'\rangle = i\hbar\,\frac{\delta(x - x')}{x - x'}. \tag{5.28a}$$

Thus $\hat{p}_x$ is a local operator. Since Eq. (28a) may be rewritten as 11 

$$\langle x|\hat{p}_x|x'\rangle = -i\hbar\,\frac{\partial}{\partial x}\,\delta(x - x'), \tag{5.28b}$$

its comparison with Eq. (23) shows that the formula used so much in Chapter 2, 

$$\hat{p}_x = -i\hbar\,\frac{\partial}{\partial x}, \tag{5.29}$$

is indeed valid, but only for the coordinate representation of the momentum operator. (Later in this 
section we will see that in an alternative, momentum representation, this operator looks completely 
different.) 
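As an informal numerical check (not part of the derivation above), the following minimal Python sketch applies the coordinate representation (29) of the momentum operator, approximated by a central finite difference, and verifies that the commutation relation (25) holds when acting on a smooth function. The test function `psi`, the point `x0`, the step `h`, and the units with $\hbar = 1$ are all assumptions made only for illustration.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1


def psi(x):
    """Hypothetical smooth test function: a Gaussian times a plane-wave factor."""
    return cmath.exp(-x * x / 2 + 1.5j * x)


def p_op(f, x, h=1e-4):
    """Momentum in the coordinate representation, -i*hbar*d/dx, via a central difference."""
    return -1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)


def commutator_xp(f, x):
    """Value of (x p - p x) f at point x."""
    xp_f = x * p_op(f, x)                  # first act with p, then multiply by x
    px_f = p_op(lambda y: y * f(y), x)     # first multiply by x, then act with p
    return xp_f - px_f


x0 = 0.7
lhs = commutator_xp(psi, x0)
rhs = 1j * HBAR * psi(x0)   # right-hand side of the commutation relation, applied to psi
print(abs(lhs - rhs))       # small; limited by the finite-difference step h
```

The residual is set by the discretization error of the derivative, so it shrinks (down to rounding noise) as `h` is reduced.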


It is straightforward to show (and virtually evident) that any operator $f(\hat{p})$ is local as well, with 
its coordinate representation being 

$$f\!\left(-i\hbar\,\frac{\partial}{\partial x}\right). \tag{5.30}$$

In particular, this pertains to the kinetic energy operator in Eq. (24), so that Eq. (20) is reduced to the 
Schrodinger equation in its familiar wave-mechanics form (1.28), if by $\hat{H}$ we mean its coordinate 
representation: 

$$\hat{H} = \frac{1}{2m}\left(-i\hbar\,\frac{\partial}{\partial x}\right)^{2} + U(x,t) = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + U(x,t). \tag{5.31}$$


Now let us return, as was promised, to operators that do not commute with operator $\hat{x}$, and 
hence do not have to share its continuous spectrum. Inner-multiplying both parts of the general Eq. 
(4.68) by the bra-vector $\langle x|$, and inserting into the left-hand part the identity operator in form (21), we get 

$$\int dx'\,\langle x|\hat{A}|x'\rangle\langle x'|a_j\rangle = A_j\,\langle x|a_j\rangle, \tag{5.32}$$


11 The equivalence of the two forms of Eq. (28) may be readily proven, for example, by comparison of their effect 
on any differentiable function $f(x, x')$, using its Taylor expansion over argument $x$ at point $x' = x$ - a simple but 
good exercise for the reader. 

i.e., using the wavefunction definition (19), 

$$\int dx'\,\langle x|\hat{A}|x'\rangle\,\Psi_j(x',t) = A_j\,\Psi_j(x,t). \tag{5.33}$$

If the operator $\hat{A}$ is space-local, i.e. satisfies Eq. (23), then this result is immediately reduced to 

$$\hat{A}\,\Psi_j(x,t) = A_j\,\Psi_j(x,t) \tag{5.34}$$

(where the left-hand part implies the coordinate representation of the operator), even if the operator does 
not commute with operator $\hat{x}$. 12 The most important case of this coordinate-representation form of the 
eigenproblem (4.68) is the familiar Eq. (1.60) for eigenvalues $E_n$ of energy. Hence, the energy spectrum 
of a system (that, as we know very well from the first chapters of the course, may be discrete) is nothing 
more than the set of eigenvalues of its Hamiltonian operator - a very important conclusion indeed. 

The operator locality also simplifies the expression for its expectation value. Indeed, plugging 
the completeness relation in the form (21) into the general Eq. (4.125) twice (written in the first case for 
$x$ and in the second case for $x'$), we get 

$$\langle A\rangle = \int dx\int dx'\,\langle\alpha(t)|x\rangle\langle x|\hat{A}|x'\rangle\langle x'|\alpha(t)\rangle = \int dx\int dx'\,\Psi_\alpha^{*}(x,t)\,\langle x|\hat{A}|x'\rangle\,\Psi_\alpha(x',t). \tag{5.35}$$

Now, Eq. (23) reduces this result to just 

$$\langle A\rangle = \int dx\int dx'\,\Psi_\alpha^{*}(x,t)\,\hat{A}\,\Psi_\alpha(x,t)\,\delta(x - x') = \int \Psi_\alpha^{*}(x,t)\,\hat{A}\,\Psi_\alpha(x,t)\,dx, \tag{5.36}$$

i.e. to Eq. (1.23), which we had to postulate in Chapter 1. 
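To make Eq. (36) concrete, here is a minimal Python sketch that evaluates the expectation values of the coordinate and momentum for a Gaussian wave packet by direct numerical integration. The packet parameters (`x0`, `p0`, `sigma`), the integration limits, the midpoint rule, and the units with $\hbar = 1$ are assumptions chosen only for illustration; for such a packet one expects $\langle x\rangle = x_0$ and $\langle p\rangle = p_0$.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def make_packet(x0, p0, sigma):
    """Normalized Gaussian packet centered at x0, with average momentum p0."""
    norm = (2 * math.pi * sigma**2) ** -0.25

    def psi(x):
        return norm * cmath.exp(-(x - x0) ** 2 / (4 * sigma**2) + 1j * p0 * x / HBAR)

    return psi


def x_op(psi, x):
    """Coordinate operator in the coordinate representation: multiplication by x."""
    return x * psi(x)


def p_op(psi, x, h=1e-4):
    """Momentum operator -i*hbar*d/dx, approximated by a central difference."""
    return -1j * HBAR * (psi(x + h) - psi(x - h)) / (2 * h)


def expectation(psi, op, a=-20.0, b=20.0, n=4000):
    """<A> = integral of psi*(x) (A psi)(x) dx, evaluated by the midpoint rule."""
    dx = (b - a) / n
    total = 0j
    for k in range(n):
        x = a + (k + 0.5) * dx
        total += psi(x).conjugate() * op(psi, x) * dx
    return total


psi = make_packet(x0=1.0, p0=2.0, sigma=0.8)
mean_x = expectation(psi, x_op).real
mean_p = expectation(psi, p_op).real
print(mean_x, mean_p)  # close to x0 and p0, respectively
```

The same `expectation` helper may be reused with any other space-local operator written as a function of `(psi, x)`.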

So, we have essentially derived all basic relations of wave mechanics from the bra-ket 
formalism, which will also allow us to get some important new results in that area. Before doing that, let 
us have a look at a pair of very interesting relations, together called the Ehrenfest theorem. In order to 
derive them, let us calculate the following commutator: 13 

$$[\hat{x}, \hat{p}_x^2] \equiv \hat{x}\hat{p}_x\hat{p}_x - \hat{p}_x\hat{p}_x\hat{x}. \tag{5.37}$$

Rewriting Heisenberg's commutation relation (25) as 

$$\hat{x}\hat{p}_x = \hat{p}_x\hat{x} + i\hbar, \tag{5.38}$$

we can use it twice in the first term of the right-hand part of Eq. (37) to sequentially move the 
momentum operators to the left: 

$$\hat{x}\hat{p}_x\hat{p}_x = (\hat{p}_x\hat{x} + i\hbar)\,\hat{p}_x = \hat{p}_x\hat{x}\hat{p}_x + i\hbar\hat{p}_x = \hat{p}_x(\hat{p}_x\hat{x} + i\hbar) + i\hbar\hat{p}_x = \hat{p}_x\hat{p}_x\hat{x} + 2i\hbar\hat{p}_x. \tag{5.39}$$


12 In some systems of quantum mechanics postulates, the Schrodinger equation (1.28) itself is considered as a sort 
of eigenstate/eigenvalue problem (34) for operator $i\hbar\,\partial/\partial t$. Notice that such a construct is very close to that of the 
momentum operator $-i\hbar\,\partial/\partial x$, and similar arguments may be given for both expressions, starting from the 
invariance of the quantum state of a free particle with respect to translations in time and space, respectively. 

13 It is not important whether we speak about the Schrodinger or Heisenberg picture here. Indeed, if three 
operators in the former picture are related as $[\hat{A}_S, \hat{B}_S] = \hat{C}_S$, then according to Eq. (4.190), in the latter picture 

$$[\hat{A}_H, \hat{B}_H] = [\hat{U}^\dagger\hat{A}_S\hat{U},\ \hat{U}^\dagger\hat{B}_S\hat{U}] = \hat{U}^\dagger\hat{A}_S\hat{U}\hat{U}^\dagger\hat{B}_S\hat{U} - \hat{U}^\dagger\hat{B}_S\hat{U}\hat{U}^\dagger\hat{A}_S\hat{U} = \hat{U}^\dagger[\hat{A}_S, \hat{B}_S]\hat{U} = \hat{U}^\dagger\hat{C}_S\hat{U} = \hat{C}_H.$$



The first term of the result cancels with the second term of Eq. (37), so that the commutator is rather 
simple: 

$$[\hat{x}, \hat{p}_x^2] = 2i\hbar\,\hat{p}_x. \tag{5.40}$$

Let us use this equality to calculate the Heisenberg-picture equation of motion for operator $\hat{x}$, 
applying the general Heisenberg equation (4.199) to the orbital motion, when the Hamiltonian has the 
form (31), with time-independent potential $U(x)$: 14 

$$\frac{d\hat{x}}{dt} = \frac{1}{i\hbar}\left[\hat{x}, \hat{H}\right] = \frac{1}{i\hbar}\left[\hat{x},\ \frac{\hat{p}_x^2}{2m} + U(\hat{x})\right]. \tag{5.41}$$


The potential energy operator commutes with the coordinate operator. Thus, the right-hand part of Eq. 
(41) is proportional to commutator (40): 

$$\frac{d\hat{x}}{dt} = \frac{\hat{p}_x}{m}. \tag{5.42}$$

In that operator equality, we readily recognize the classical relation between the particle's momentum and 
its velocity. 


Now let us see what a similar procedure gives for the momentum's derivative: 

$$\frac{d\hat{p}_x}{dt} = \frac{1}{i\hbar}\left[\hat{p}_x, \hat{H}\right] = \frac{1}{i\hbar}\left[\hat{p}_x,\ \frac{\hat{p}_x^2}{2m} + U(\hat{x})\right]. \tag{5.43}$$

The kinetic energy operator commutes with the momentum operator, and hence may be dropped from 
the right-hand part of this equation. In order to calculate the remaining commutator of the momentum 
and potential energy, let us use the fact that any smooth potential profile may be represented by its 
Taylor expansion: 

$$U(\hat{x}) = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k U}{\partial x^k}\,\hat{x}^k, \tag{5.44}$$


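where the derivatives of $U$ should be understood as c-numbers (evaluated at $x = 0$), so that we may write 

$$\left[\hat{p}_x, U(\hat{x})\right] = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k U}{\partial x^k}\left[\hat{p}_x, \hat{x}^k\right] = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k U}{\partial x^k}\Big(\hat{p}_x\underbrace{\hat{x}\hat{x}\cdots\hat{x}}_{k\ \text{times}} - \underbrace{\hat{x}\hat{x}\cdots\hat{x}}_{k\ \text{times}}\hat{p}_x\Big). \tag{5.45}$$

Applying Eq. (38) $k$ times to the last term in the parentheses, exactly as we did in Eq. (39), we get 

$$\left[\hat{p}_x, U(\hat{x})\right] = -\sum_{k=1}^{\infty}\frac{1}{k!}\frac{\partial^k U}{\partial x^k}\,ik\hbar\,\hat{x}^{k-1} = -i\hbar\sum_{k=1}^{\infty}\frac{1}{(k-1)!}\frac{\partial^k U}{\partial x^k}\,\hat{x}^{k-1}. \tag{5.46}$$

But the last sum is just the Taylor expansion of the derivative $\partial U/\partial x$. Indeed, 

$$\frac{\partial U}{\partial x} = \sum_{k'=0}^{\infty}\frac{1}{k'!}\frac{\partial^{k'}}{\partial x^{k'}}\!\left(\frac{\partial U}{\partial x}\right)\hat{x}^{k'} = \sum_{k'=0}^{\infty}\frac{1}{k'!}\frac{\partial^{k'+1} U}{\partial x^{k'+1}}\,\hat{x}^{k'} = \sum_{k=1}^{\infty}\frac{1}{(k-1)!}\frac{\partial^k U}{\partial x^k}\,\hat{x}^{k-1}, \tag{5.47}$$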


14 Since this Hamiltonian is time-independent, we may replace the partial derivative over time $t$ with the full one. 

where at the last step I have replaced the notation of the summation index from $k'$ to $k - 1$. As a result, 
Eq. (43) yields: 

$$\frac{d\hat{p}_x}{dt} = -\frac{\partial}{\partial x}U(\hat{x}). \tag{5.48}$$


This equation again coincides with the classical equation of motion! Discussing spin dynamics in 
Secs. 4.6 and 5.1, we have already seen that this is very typical of the Heisenberg picture. Moreover, 
averaging Eqs. (42) and (48) over the initial state (as Eq. (4.191) prescribes 15), we get similar results for 
the expectation values: 16 

$$\frac{d\langle x\rangle}{dt} = \frac{\langle p_x\rangle}{m}, \qquad \frac{d\langle p_x\rangle}{dt} = -\left\langle\frac{\partial U}{\partial x}\right\rangle. \tag{5.49}$$

However, it is important to remember that the equivalence between these quantum-mechanical 
equations and similar equations of classical mechanics is superficial, and the degree of the similarity 
between the two mechanics very much depends on the problem. As one extreme, let us consider the case 
when a particle's state, at any moment between $t_0$ and $t$, may be accurately represented by one, relatively 
narrow wave packet. Then we may interpret Eqs. (49) as equations of essentially classical motion for the 
wave packet's center, in accordance with the correspondence principle. However, even in this case it is 
important to remember the purely quantum mechanical effects of nonvanishing wave packet width 
and its spreading in time, which were discussed in Sec. 2.2. 
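The role of the packet's width may be illustrated by a short Python sketch comparing the average force $-\langle\partial U/\partial x\rangle$, which enters Eqs. (49), with the classical force evaluated at the packet's center. The quartic potential $U = x^4/4$, the Gaussian probability density, and the particular widths are hypothetical choices made only for this illustration; for them, the exact average is $-(x_0^3 + 3x_0\sigma^2)$.

```python
import math


def gaussian_density(x, x0, sigma):
    """Probability density w(x) of a Gaussian packet of r.m.s. width sigma centered at x0."""
    return math.exp(-(x - x0) ** 2 / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)


def average(f, x0, sigma, n=20000):
    """<f> = integral of f(x) w(x) dx, by the midpoint rule over x0 +/- 12 sigma."""
    a = x0 - 12 * sigma
    dx = 24 * sigma / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * dx
        total += f(x) * gaussian_density(x, x0, sigma) * dx
    return total


def force(x):
    """Classical force -dU/dx for the assumed potential U(x) = x**4 / 4."""
    return -x**3


x0 = 1.0
narrow = average(force, x0, sigma=0.01)   # very narrow packet
wide = average(force, x0, sigma=0.5)      # packet width comparable with x0
print(narrow, wide, force(x0))
```

For the narrow packet the averaged force practically coincides with the classical value at the center, while for the wide one it differs from it substantially, so that Eqs. (49) alone no longer give a closed, classical-like description of the center's motion.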


In the opposite extreme, Eqs. (49), though valid, may tell almost nothing about the system's 
dynamics. Maybe the most apparent example is the “leaky” quantum well that was discussed in Sec. 2.5 
- see Fig. 2.18 and its discussion. Since both the potential $U(x)$ and the initial state are symmetric 
relative to point $x = 0$, the right-hand parts of both Eqs. (49) identically equal zero. Of course, the result 
(that average values of both momentum and coordinate stay equal to zero at all times) is correct, but it does 
not tell us too much about the rich dynamics of the system (the finite lifetime of the metastable state, the 
formation of two wave packets, their waveform and propagation speed), and about the important insight 
the solution gives for the quantum measurement theory. Another similar example is the band theory 
(Sec. 2.7), with its purely quantum effect of the allowed energy bands and forbidden gaps, of which Eqs. 
(49) give no clue. 


To summarize, the Ehrenfest theorem is important as an illustration of the correspondence 
principle, but its predictive power should not be exaggerated. 

Now we are ready to patch some holes left during our studies of wave mechanics in Chapters 1-3. 
First of all, I have promised you to develop a more balanced view of the monochromatic de Broglie 
waves (4.1), which would be more respectful to the evident $\mathbf{r} \leftrightarrow \mathbf{p}$ symmetry of the coordinate and 
momentum. Let us do this for the 1D case when the wave may be presented as 17 


15 Indeed, acting exactly as at the derivation of Eq. (36), for a space-local Heisenberg operator we get 

$$\langle A\rangle(t) = \int \Psi^{*}(x,t_0)\,\hat{A}_H(t,t_0)\,\Psi(x,t_0)\,dx.$$

16 The set of equations (49) constitutes the Ehrenfest theorem. 

17 From this point on, for the sake of brevity I will drop index $x$ in the notation of the momentum - just as it was 
done in Chapter 2. 

$$\psi_p(x) = a_p\exp\left\{i\frac{px}{\hbar}\right\}, \quad \text{for all } -\infty < x < +\infty. \tag{5.50}$$

Let us have a good look at this function. Since it satisfies equation (34) for the 1D momentum operator 
$\hat{p} = -i\hbar\,\partial/\partial x$, 

$$\hat{p}\,\psi_p = p\,\psi_p, \tag{5.51}$$

$\psi_p$ is an eigenfunction of the momentum operator. But this means that we can also write Eq. (6) for the 
corresponding ket-vector: 

$$\hat{p}\,|p\rangle = p\,|p\rangle, \tag{5.52}$$

and according to Eq. (19) the wavefunction may be presented as 

$$\psi_p(x) = \langle x|p\rangle. \tag{5.53}$$

Expression (53) is quite remarkable in its $x \leftrightarrow p$ symmetry - which may be pursued further on. 
Before doing that, however, we have to discuss normalization of such functions. Indeed, in this case, the 
probability density $w(x)$ (18) is constant, so that its integral 

$$\int_{-\infty}^{+\infty} w(x)\,dx = \int_{-\infty}^{+\infty}\psi_p^{*}(x)\,\psi_p(x)\,dx \tag{5.54}$$

diverges if $a_p \neq 0$. Earlier in the course, we discussed two ways to avoid this divergence. One is to use a 
very large but finite integration volume - see Eq. (1.31). Another way to avoid the divergence is to form 
a wave packet of the type (2.20), possibly of a very large length and very narrow spread of momenta $p$. 
Then integral (54) may be required to equal 1 without any conceptual problem. 

However, both these methods violate the $x \leftrightarrow p$ symmetry, and hence are inconvenient for our 
current purposes. Instead, let us continue to identify the bra- and ket-vectors $\langle a_A|$ and $|a_A\rangle$ of the general 
theory, developed in the beginning of this section, with eigenvectors $\langle p|$ and $|p\rangle$ of momentum - just as 
we have already done in Eq. (52). Then the normalization condition (9) becomes 

$$\langle p|p'\rangle = \delta(p - p'). \tag{5.55}$$

Inserting the identity operator in the form (21) (with the integration variable $x'$ replaced by $x$) into the 
left-hand side of this equation, we can translate this normalization rule to the wavefunction language: 

$$\int dx\,\langle p|x\rangle\langle x|p'\rangle = \int dx\,\psi_p^{*}(x)\,\psi_{p'}(x) = \delta(p - p'). \tag{5.56}$$

Now using Eq. (50), this requirement turns into the following condition: 

$$a_p^{*}a_{p'}\int_{-\infty}^{+\infty}\exp\left\{i\frac{(p' - p)x}{\hbar}\right\}dx = |a_p|^2\,2\pi\hbar\,\delta(p - p') = \delta(p - p'), \tag{5.57}$$

so that, finally, $a_p = e^{i\phi}/(2\pi\hbar)^{1/2}$, where $\phi$ is an arbitrary (real) phase, and Eq. (50) becomes 18 


18 Repeating the calculation for each Cartesian component of a plane monochromatic wave of arbitrary 
dimensionality $d$, we get $\psi_{\mathbf{p}} = (2\pi\hbar)^{-d/2}\exp\{i(\mathbf{p}\cdot\mathbf{r}/\hbar + \phi)\}$. 

$$\psi_p(x) = \frac{1}{(2\pi\hbar)^{1/2}}\exp\left\{i\left(\frac{px}{\hbar} + \phi\right)\right\}. \tag{5.58}$$
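The delta-function normalization (55)-(57) may be checked numerically. In the Python sketch below, the overlap integral (56) of two functions (58) is taken over a finite segment $-L < x < +L$, where it equals $\sin[(p' - p)L/\hbar]/\pi(p' - p)$, and is then integrated over $p'$ with a smooth test function; at large $L$ the result should approach the value of that function at $p' = p$, as it would for a true delta function. The test function `g`, the segment half-length `L`, the integration grid, and the units with $\hbar = 1$ are assumptions made for illustration only.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def overlap(p, p_prime, L):
    """Integral of psi_p^*(x) psi_p'(x) over -L < x < +L, with |a_p|^2 = 1/(2*pi*hbar)."""
    q = (p_prime - p) / HBAR
    if abs(q) < 1e-12:
        return L / (math.pi * HBAR)          # limit of the expression below at q -> 0
    return math.sin(q * L) / (math.pi * HBAR * q)


def smeared(g, p, L, a=-8.0, b=8.0, n=20000):
    """Integral of g(p') * overlap(p, p') dp' by the midpoint rule."""
    dp = (b - a) / n
    total = 0.0
    for k in range(n):
        p_prime = a + (k + 0.5) * dp
        total += g(p_prime) * overlap(p, p_prime, L) * dp
    return total


def g(p):
    """Hypothetical smooth test function of momentum."""
    return math.exp(-p * p)


value = smeared(g, 0.3, L=50.0)
print(value, g(0.3))  # the two numbers should practically coincide
```

Reducing `L` to a value of the order of 1 makes the overlap a broad function of $p' - p$, and the agreement is lost, as it should be.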


As was mentioned above, for finite-length wave packets such normalization is not really 
necessary. However, frequently it makes sense to keep the pre-exponential coefficient in Eq. (58) even 
for wave packets, for the following reason. Let us form a wave packet of the type (2.20), based 
on wavefunctions (58) - taking $\phi = 0$ for notation brevity, because it may be incorporated into 
function $\varphi(p)$: 

$$\psi(x) = \frac{1}{(2\pi\hbar)^{1/2}}\int\varphi(p)\exp\left\{i\frac{px}{\hbar}\right\}dp. \tag{5.59}$$

From the mathematical point of view, this is just the equation of a 1D Fourier spatial transform, and its 
reciprocal is 

$$\varphi(p) = \frac{1}{(2\pi\hbar)^{1/2}}\int\psi(x)\exp\left\{-i\frac{px}{\hbar}\right\}dx. \tag{5.60}$$

These expressions are completely symmetric, and present the same wave packet; this is why functions 
$\psi(x)$ and $\varphi(p)$ are frequently called, respectively, the coordinate (x-) and momentum (p-) representations 
of the (same) state of the particle. Using Eqs. (53) and (58), they may be presented in an even more 
manifestly symmetric form, 

$$\psi(x) = \int\varphi(p)\,\langle x|p\rangle\,dp, \qquad \varphi(p) = \int\psi(x)\,\langle p|x\rangle\,dx, \tag{5.61}$$

in which the scalar products satisfy the basic postulate (4.14) of the bra-ket formalism: 

$$\langle p|x\rangle = \frac{1}{(2\pi\hbar)^{1/2}}\exp\left\{-i\frac{px}{\hbar}\right\} = \langle x|p\rangle^{*}. \tag{5.62}$$
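As an illustration of the transform from the coordinate to the momentum representation, the following Python sketch evaluates the integral $\varphi(p) = \int\psi(x)\langle p|x\rangle dx$ numerically for a Gaussian wave packet, and compares the result with the well-known analytical Fourier image of a Gaussian. The packet parameters, the integration limits, the midpoint rule, and the units with $\hbar = 1$ are assumptions made only for this example.

```python
import cmath
import math

HBAR = 1.0            # assumption: units in which hbar = 1
SIGMA, P0 = 0.8, 2.0  # hypothetical packet width and average momentum


def psi(x):
    """Coordinate representation: normalized Gaussian packet centered at x = 0."""
    norm = (2 * math.pi * SIGMA**2) ** -0.25
    return norm * cmath.exp(-x * x / (4 * SIGMA**2) + 1j * P0 * x / HBAR)


def bracket_p_x(p, x):
    """Scalar product <p|x> = (2*pi*hbar)^(-1/2) * exp(-i*p*x/hbar)."""
    return cmath.exp(-1j * p * x / HBAR) / math.sqrt(2 * math.pi * HBAR)


def phi_numeric(p, a=-20.0, b=20.0, n=4000):
    """Momentum representation: integral of psi(x) <p|x> dx, by the midpoint rule."""
    dx = (b - a) / n
    total = 0j
    for k in range(n):
        x = a + (k + 0.5) * dx
        total += psi(x) * bracket_p_x(p, x) * dx
    return total


def phi_exact(p):
    """Analytical Fourier image of the Gaussian packet, for comparison."""
    norm = (2 * SIGMA**2 / (math.pi * HBAR**2)) ** 0.25
    return norm * math.exp(-(SIGMA * (p - P0) / HBAR) ** 2)


max_error = max(abs(phi_numeric(p) - phi_exact(p)) for p in (1.0, 2.0, 2.5, 3.0))
print(max_error)  # very small: the two representations describe the same state
```

Note that the momentum-space packet is centered at `P0` and has a width of the order of $\hbar/\sigma$, so that a narrower $\psi(x)$ corresponds to a broader $\varphi(p)$.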


We already know that in the x-representation, i.e. in the usual wave mechanics, the coordinate 
operator $\hat{x}$ is reduced to the multiplication by $x$, and the momentum operator is proportional to a 
derivative over $x$: 

$$\hat{x}\big|_{\text{in }x} = x, \qquad \hat{p}\big|_{\text{in }x} = -i\hbar\frac{\partial}{\partial x}. \tag{5.63}$$

It is natural to guess that in the p-representation, the expressions for operators would be reciprocal: 

$$\hat{x}\big|_{\text{in }p} = +i\hbar\frac{\partial}{\partial p}, \qquad \hat{p}\big|_{\text{in }p} = p, \tag{5.64}$$

with the difference in one sign only, due to the opposite signs of the Fourier exponents in Eqs. (59) and 
(60). The proof of Eqs. (64) is straightforward; for example, acting by the momentum operator on 
wavefunction (59), we get 

$$\hat{p}\,\psi(x) = -i\hbar\frac{\partial}{\partial x}\psi(x) = \frac{1}{(2\pi\hbar)^{1/2}}\int\varphi(p)\left(-i\hbar\frac{\partial}{\partial x}\exp\left\{i\frac{px}{\hbar}\right\}\right)dp = \frac{1}{(2\pi\hbar)^{1/2}}\int p\,\varphi(p)\exp\left\{i\frac{px}{\hbar}\right\}dp, \tag{5.65}$$

and similarly for operator $\hat{x}$ acting on function $\varphi(p)$. Hence, the action of the operators (63) on 
wavefunction $\psi$ (i.e. the state's x-representation) gives the same results as the action of operators (64) on 
function $\varphi$ (i.e. its p-representation). 

It is interesting to have one more, different look at this coordinate-to-momentum duality. For 
that, notice that according to Eqs. (4.82)-(4.84), we may consider the bra-ket $\langle x|p\rangle$ as an element of the 
(infinite-size) matrix $U_{xp}$ of the unitary transform from the x-basis to the p-basis. Now let us derive the 
operator transform rule that would be a continuous version of Eq. (4.92). Say, we want to calculate a 
matrix element of some operator in the p-representation: 

$$\langle p|\hat{A}|p'\rangle. \tag{5.66}$$


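Inserting two identity operators (21) into this bra-ket, and then using Eq. (53) and its complex conjugate, 
and also Eq. (23) (again, valid only for space-local operators!), we get 

$$\begin{aligned}
\langle p|\hat{A}|p'\rangle &= \int dx\int dx'\,\langle p|x\rangle\langle x|\hat{A}|x'\rangle\langle x'|p'\rangle = \int dx\int dx'\,\psi_p^{*}(x)\,\langle x|\hat{A}|x'\rangle\,\psi_{p'}(x') \\
&= \frac{1}{2\pi\hbar}\int dx\int dx'\exp\left\{-i\frac{px}{\hbar}\right\}\delta(x - x')\,\hat{A}\exp\left\{i\frac{p'x'}{\hbar}\right\} = \frac{1}{2\pi\hbar}\int dx\exp\left\{-i\frac{px}{\hbar}\right\}\hat{A}\exp\left\{i\frac{p'x}{\hbar}\right\}.
\end{aligned} \tag{5.67}$$

For example, for the momentum operator itself, this relation yields: 

$$\langle p|\hat{p}|p'\rangle = \frac{1}{2\pi\hbar}\int dx\exp\left\{-i\frac{px}{\hbar}\right\}\left(-i\hbar\frac{\partial}{\partial x}\right)\exp\left\{i\frac{p'x}{\hbar}\right\} = \frac{p'}{2\pi\hbar}\int\exp\left\{i\frac{(p' - p)x}{\hbar}\right\}dx = p'\,\delta(p' - p). \tag{5.68}$$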


Due to Eq. (52), this result is equivalent to the second of Eqs. (64). 

A natural question arises: why is the momentum representation used much less frequently than 
the coordinate representation - i.e., the wave mechanics? The answer is purely practical: besides the 
special case of the harmonic oscillator (to be revisited in Sec. 4 below), the orbital motion Hamiltonian 
(31) is not $x \leftrightarrow p$ symmetric, with the potential energy $U(x)$ being typically a more complex function 
than the kinetic energy, which is quadratic in momentum. Because of that, it is easier for problem 
solution to keep the potential energy operator just a wavefunction multiplier, as it is in the coordinate 
representation. 

The most significant exception to this rule is the motion in a periodic potential, especially in the 
presence of an additional external force $F(t)$, which may result in the effects discussed in Secs. 2.8 and 2.9 
(the Bloch oscillations, Landau-Zener tunneling, etc.). Indeed, in this case the dispersion relation $E_n(q)$, 
typically rather involved, plays the role of the effective kinetic energy, while the effective potential 
energy $U_{ef} = -F(t)x$ in the field of the additional force is a simple function of $x$. This is why discussions 
of the listed and more complex issues of the band theory (such as quasiparticle scattering, mobility, 
diffusion, etc.) in solid state physics theory are most typically based on the momentum representation. 


5.3. Feynman’s path integrals 

As has been already mentioned, even within the realm of wave mechanics, the bra-ket language 
allows one to streamline some calculations that would be very bulky using the notation used in Chapters 1-3. 
Probably the best example is the famous alternative, path integral formulation of quantum mechanics, 
developed in 1948 by R. Feynman. 19 I will review this important concept - admittedly cutting one math 
corner for brevity. 20 (This shortcut will be clearly marked.) 

Let us inner-multiply both parts of Eq. (4.157), which is essentially the definition of the 
time-evolution operator, by the bra-vector of state $x$, 

$$\langle x|\alpha(t)\rangle = \langle x|\hat{u}(t,t_0)|\alpha(t_0)\rangle, \tag{5.69}$$

insert the identity operator before the ket-vector in the right-hand part, and then use the closure 
condition in the form of Eq. (21), with $x'$ replaced with $x_0$: 

$$\langle x|\alpha(t)\rangle = \int dx_0\,\langle x|\hat{u}(t,t_0)|x_0\rangle\langle x_0|\alpha(t_0)\rangle. \tag{5.70}$$

According to Eq. (19), this equality may be presented as 

$$\Psi_\alpha(x,t) = \int dx_0\,\langle x|\hat{u}(t,t_0)|x_0\rangle\,\Psi_\alpha(x_0,t_0). \tag{5.71}$$

Comparing this expression with Eq. (2.44), we see that the bra-ket in this relation is nothing else than 
the 1D propagator, which was discussed in Sec. 2.2: 

$$\langle x|\hat{u}(t,t_0)|x_0\rangle = G(x,t;x_0,t_0). \tag{5.72}$$

As a reminder, we have already calculated the propagator for a free particle - see Eq. (2.49). 

Now let us break the time segment $[t_0, t]$ into $N$ (for the time being, not necessarily equal) parts 
by inserting $(N - 1)$ intermediate points (Fig. 2) 

$$t_0 < t_1 < \ldots < t_k < \ldots < t_{N-1} < t, \tag{5.73}$$

and rewrite the time evolution operator in the form 

$$\hat{u}(t,t_0) = \hat{u}(t,t_{N-1})\,\hat{u}(t_{N-1},t_{N-2})\cdots\hat{u}(t_2,t_1)\,\hat{u}(t_1,t_0), \tag{5.74}$$

whose correctness is evident from the very definition (4.157) of the operator. Plugging Eq. (74) into Eq. 
(72), let us insert the identity operator, again in the form (21) but written for $x_k$ rather than $x'$, between 
each two partial evolution operators including time argument $t_k$. The result is 

$$G(x,t;x_0,t_0) = \int dx_{N-1}\int dx_{N-2}\cdots\int dx_1\,\langle x|\hat{u}(t,t_{N-1})|x_{N-1}\rangle\langle x_{N-1}|\hat{u}(t_{N-1},t_{N-2})|x_{N-2}\rangle\cdots\langle x_1|\hat{u}(t_1,t_0)|x_0\rangle. \tag{5.75}$$


[Figure: time axis with points $t_0, t_1, \ldots, t_{N-2}, t_{N-1}, t$ and the corresponding coordinates $x_0, x_1, \ldots, x_{N-2}, x_{N-1}, x$.] 

Fig. 5.2. Time partition and coordinate notation at the initial stage of Feynman's path integral derivation. 

19 According to Feynman's memoirs, his work was motivated by a “mysterious” remark by P. A. M. Dirac in his 
pioneering 1930 textbook on quantum mechanics. 

20 For a more thorough discussion of the path-integral approach, see the famous text R. Feynman and A. Hibbs, 
Quantum Mechanics and Path Integrals, first published in 1965. (For its latest edition by Dover in 2010, the book 
was emended by D. Styer.) For a more recent monograph that reviews more applications, see L. Schulman, 
Techniques and Applications of Path Integration, Wiley, 1981. 

The physical sense of each integration variable $x_k$ is the wavefunction's argument at time $t_k$ - see 
Fig. 2. Feynman's key breakthrough was the realization that if all intervals are similar and 
sufficiently small, $t_k - t_{k-1} = d\tau \to 0$, all the partial bra-kets participating in Eq. (75) may be readily 
expressed via Eq. (2.49), even if the particle is not free, but moves in a stationary potential profile $U(x)$. 
To show that, let us use either Eq. (4.175) or Eq. (4.181), which, for a small time interval $d\tau$, give the 
same result: 

$$\hat{u}(\tau + d\tau, \tau) = \exp\left\{-\frac{i}{\hbar}\hat{H}\,d\tau\right\} = \exp\left\{-\frac{i}{\hbar}\left(\frac{\hat{p}^2}{2m}\,d\tau + U(\hat{x})\,d\tau\right)\right\}. \tag{5.76}$$


Generally, an exponent of a sum of two operators may be treated as that of c-number arguments, 
and in particular factored into a product of two exponents, only if the operators commute. (Indeed, in 
this case we can use all the standard algebra for exponents of c-number arguments.) In our case, this is 
not so, because operator $\hat{p}$ does not commute with $\hat{x}$, and hence with $U(\hat{x})$. However, it may be 
shown 21 that for an infinitesimal time interval $d\tau$, the nonvanishing commutator 

$$\left[\frac{\hat{p}^2}{2m}\,d\tau,\ U(\hat{x})\,d\tau\right] \neq 0, \tag{5.77}$$

proportional to $(d\tau)^2$, is so small that in the first approximation in $d\tau$ its effects may be ignored. As a 
result, we may factor the right-hand part in Eq. (76) by writing 

$$\hat{u}(\tau + d\tau, \tau)\ \xrightarrow[d\tau\to 0]{}\ \exp\left\{-\frac{i}{\hbar}\frac{\hat{p}^2}{2m}\,d\tau\right\}\exp\left\{-\frac{i}{\hbar}U(\hat{x})\,d\tau\right\}. \tag{5.78}$$

(This approximation is very much similar in spirit to the rectangle-formula approximation for a usual 1D 
integral, which is also asymptotically impeccable.) 
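The quadratic smallness of the factoring error may be illustrated on the simplest pair of non-commuting operators. The Python sketch below replaces the kinetic and potential energy operators with the Pauli matrices $\sigma_x$ and $\sigma_z$ (a stand-in assumed purely for illustration, with $\hbar = 1$), and compares the exact exponent of their sum with the product of the two separate exponents for two values of the small step. The matrix exponent is calculated from its Taylor series; all helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(a, b, sign=1):
    """Matrix sum a + sign*b."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def scale(a, c):
    """Matrix a multiplied by the number c."""
    return [[c * v for v in row] for row in a]


def expm(a, terms=30):
    """exp(a) of a small 2x2 matrix, from its Taylor series."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = scale(matmul(term, a), 1 / n)   # a^n / n!
        result = add(result, term)
    return result


def frobenius_norm(a):
    return math.sqrt(sum(abs(v) ** 2 for row in a for v in row))


A = [[0, 1], [1, 0]]    # sigma_x, a stand-in for the kinetic energy operator
B = [[1, 0], [0, -1]]   # sigma_z, a stand-in for the potential energy operator


def splitting_error(h):
    """Norm of exp{-i(A+B)h} - exp{-iAh} exp{-iBh}."""
    exact = expm(scale(add(A, B), -1j * h))
    split = matmul(expm(scale(A, -1j * h)), expm(scale(B, -1j * h)))
    return frobenius_norm(add(exact, split, sign=-1))


e1, e2 = splitting_error(0.1), splitting_error(0.05)
ratio = e1 / e2
print(e1, e2, ratio)  # halving the step reduces the error by a factor close to 4
```

Since the error of each of the $N \propto 1/d\tau$ steps scales as $(d\tau)^2$, the accumulated error scales as $d\tau$ and vanishes in the limit $d\tau \to 0$.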

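Since the second exponential function in the right-hand part of Eq. (78) commutes with the 
coordinate operator, we can move it out of each partial bra-ket participating in Eq. (75), with $U(\hat{x})$ 
turning into a c-number function: 

$$\langle x_{\tau+d\tau}|\hat{u}(\tau + d\tau, \tau)|x_\tau\rangle = \langle x_{\tau+d\tau}|\exp\left\{-\frac{i}{\hbar}\frac{\hat{p}^2}{2m}\,d\tau\right\}|x_\tau\rangle\,\exp\left\{-\frac{i}{\hbar}U(x)\,d\tau\right\}. \tag{5.79}$$

But the remaining bra-ket is just the propagator of a free particle, and we can use Eq. (2.49) for it: 

$$\langle x_{\tau+d\tau}|\exp\left\{-\frac{i}{\hbar}\frac{\hat{p}^2}{2m}\,d\tau\right\}|x_\tau\rangle = \left(\frac{m}{2\pi i\hbar\,d\tau}\right)^{1/2}\exp\left\{i\frac{m(dx)^2}{2\hbar\,d\tau}\right\}. \tag{5.80}$$

As the result, the full propagator (75) takes the form 

$$G(x,t;x_0,t_0) = \lim_{\substack{d\tau\to 0\\ N\to\infty}}\int dx_{N-1}\int dx_{N-2}\cdots\int dx_1\left(\frac{m}{2\pi i\hbar\,d\tau}\right)^{N/2}\exp\left\{\sum_{k=1}^{N}\left[i\frac{m(dx)^2}{2\hbar\,d\tau} - i\frac{U(x)}{\hbar}\,d\tau\right]_{\tau = t_k}\right\}. \tag{5.81}$$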


21 A strict proof of this intuitively evident statement would take more space and time than I can afford. 
At $N \to \infty$ and hence $d\tau \equiv (t - t_0)/N \to 0$, the sum under the exponent in this expression tends to an
integral:

$$\sum_{k=1}^{N} \frac{i}{\hbar}\left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 - U(x)\right]_{\tau = \tau_k} d\tau \to \frac{i}{\hbar}\int_{t_0}^{t}\left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 - U(x)\right] d\tau, \tag{5.82}$$


and the expression in square brackets is just the particle’s Lagrangian function $L$. 22 The integral of
this function over time is the classical action $S$ calculated along a particular “path” $x(\tau)$. 23 As a result,
defining the (1D) path integral as

$$\int (\ldots)\,D[x(\tau)] \equiv \lim_{\substack{d\tau \to 0 \\ N \to \infty}} \left(\frac{m}{2\pi i\hbar\,d\tau}\right)^{N/2} \int dx_{N-1} \int dx_{N-2} \ldots \int dx_1\, (\ldots), \tag{5.83a}$$

1D path integral: definition

we can bring our result to a superficially simple form

$$G(x, t; x_0, t_0) = \int \exp\left\{\frac{i}{\hbar}S[x(\tau)]\right\} D[x(\tau)]. \tag{5.83b}$$

1D propagator via path integral


The name “path integral” for the mathematical construct (83a) may be readily explained if we
keep the number $N$ of time intervals large but finite, and also approximate each of the enclosed integrals
by a sum over $M \gg 1$ discrete points along the coordinate axis (Fig. 3a).



Fig. 5.3. Several 1D classical paths in (a) the discrete approximation and (b) the continuous limit.


Then the path integral is a product of $(N - 1)$ sums corresponding to different values of time $\tau$,
each of them with $M$ terms, each of the terms representing the function under the integral at a particular
spatial point. Multiplying those $(N - 1)$ sums, we get a sum of $M^{N-1}$ terms, each of them combining the
values of the function at $(N - 1)$ specific spatial-temporal points $[x, \tau]$, one per time slice. These terms
represent all possible different continuous classical paths $x[\tau]$ from the initial point $[x_0, t_0]$ to the
final point $[x, t]$. It is evident that the last interpretation remains true even in the continuous limit
$N, M \to \infty$ - see Fig. 3b.

Why does such a representation of the sum make sense? This is because in the classical limit the
particle follows just a certain path, corresponding to the minimum of action $S_{cl}$. Hence, for all close
trajectories, the difference $(S - S_{cl})$ is proportional to the square of the deviation from the classical
trajectory. Hence, for a quasiclassical motion, with $S_{cl} \gg \hbar$, there is a substantial bunch of close
trajectories, with $(S - S_{cl}) \ll \hbar$, that give similar contributions to the path integral. On the other hand,


22 See, e.g., CM Sec. 2.1. 

23 See, e.g., CM Sec. 9.2. 
strongly non-classical trajectories, with $(S - S_{cl}) \gg \hbar$, give phases $S/\hbar$ rapidly oscillating from one
trajectory to the next one, and their contributions to the path integral are averaged out. 24 As a result, for
the quasiclassical motion, the propagator’s exponent may be evaluated on the classical path:

$$G_{cl} \propto \exp\left\{\frac{i}{\hbar}\int_{t_0}^{t}\left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 - U(x)\right] d\tau\right\}. \tag{5.84}$$
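
The statement that the action excess over the classical path is quadratic in the path deviation may be checked with a short numerical sketch. It uses the discretized action in the form of the sum in Eq. (5.82); the harmonic potential, the parameter values, the end points, and the sinusoidal shape of the deformation are all assumptions chosen only for illustration.

```python
import math

M, OMEGA, T = 1.0, 1.0, 1.0     # assumed units; T is shorter than half a period
X_START, X_END = 0.0, 1.0       # assumed end points [x0, t0 = 0] and [x, t = T]
N = 1000                        # number of time slices
DTAU = T / N

def classical_path(tau):
    """Exact classical trajectory of the oscillator between the fixed end points."""
    b = (X_END - X_START * math.cos(OMEGA * T)) / math.sin(OMEGA * T)
    return X_START * math.cos(OMEGA * tau) + b * math.sin(OMEGA * tau)

def action(path):
    """Discretized action: sum over slices of [m (dx)^2 / (2 dtau) - U(x) dtau]."""
    s = 0.0
    for k in range(N):
        dx = path[k + 1] - path[k]
        u = 0.5 * M * OMEGA**2 * path[k]**2
        s += M * dx**2 / (2 * DTAU) - u * DTAU
    return s

def action_excess(eps):
    """S - S_cl for the path x_cl + eps*sin(pi*tau/T), which keeps the end points."""
    taus = [k * DTAU for k in range(N + 1)]
    x_cl = [classical_path(t) for t in taus]
    x_def = [x + eps * math.sin(math.pi * t / T) for x, t in zip(x_cl, taus)]
    return action(x_def) - action(x_cl)

excess_small = action_excess(0.1)
excess_large = action_excess(0.2)
ratio = excess_large / excess_small   # close to 4: the excess is quadratic in eps
```

With these assumed parameters the excess is positive, i.e. the classical path is indeed a minimum of the action, and doubling the deformation amplitude multiplies the excess by about four.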


The sum of the kinetic and potential energies is the full energy E of the particle, that remains constant 
for motion in a stationary potential U(x), so that we may rewrite the expression under the integral as 25 


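$$\left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 - U(x)\right] d\tau = \left[m\left(\frac{dx}{d\tau}\right)^2 - E\right] d\tau = m\frac{dx}{d\tau}\,dx - E\,d\tau. \tag{5.85}$$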


With that replacement, Eq. (83b) yields 


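$$G_{cl} \propto \exp\left\{\frac{i}{\hbar}\int_{x_0}^{x} m\frac{dx}{d\tau}\,dx\right\} \exp\left\{-\frac{i}{\hbar}E(t - t_0)\right\} = \exp\left\{\frac{i}{\hbar}\int_{x_0}^{x} p(x)\,dx\right\} \exp\left\{-\frac{i}{\hbar}E(t - t_0)\right\}, \tag{5.86}$$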


where p is the classical momentum of the particle. But (at least, leaving the pre-exponential factor alone) 
this is exactly the WKB approximation result that was derived and studied in detail in Chapter 2! 

One may question the value of a calculation that yields the results that could be readily obtained
from Schrodinger’s wave mechanics. Feynman’s approach is indeed not used too often, but it has
its merits. First, it has an important philosophical (and hence heuristic) value. Indeed, Eq. (83) may be
interpreted by saying that the essence of quantum mechanics is the exploration, by the system, of all
possible paths $x(\tau)$, each of them classical-like in the sense that the particle’s coordinate $x$ and velocity
$dx/d\tau$ (and hence its momentum) are exactly defined simultaneously at each point. The resulting
contributions to the path integral are added up coherently to form the final propagator $G$, and via it, the
final probability $W \propto |G|^2$ of the particle propagation from $[x_0, t_0]$ to $[x, t]$. Of course, as the scale of action
(i.e. of the energy-by-time product) of the motion decreases and becomes comparable to $\hbar$, more and
more paths produce substantial contributions to this sum, and hence to $W$, ensuring a larger and larger
difference between the quantum and classical properties of the system.

Second, the path integral provides a justification for some simple explanations of quantum
phenomena. A typical example is the quantum interference effects discussed in Sec. 3.1 - see, e.g., Fig.
3.1 and the corresponding text. In that discussion, we used the Huygens principle to argue that at the
two-slit interference, the WKB approximation may be restricted to the effects of two paths that pass
through different slits, but otherwise consist of straight-line segments. To have another look at that
assumption, let us generalize the path integral to multi-dimensional geometries. Fortunately, the simple
structure of Eq. (83b) makes such generalization virtually evident:


24 This fact may be proved by expanding the difference $(S - S_{cl})$ in the Taylor series in path variations (leaving
only the leading quadratic terms) and working out the resulting Gaussian integrals. It is interesting that the
integration, together with the pre-exponential coefficient in Eq. (83a), gives exactly the pre-exponential
factor that we have already found in Sec. 2.4 when refining the WKB approximation.

25 The same trick is often used in analytical classical mechanics - say, for proving the Hamilton principle, and for 
the derivation of the Hamilton - Jacobi equations (see, e.g. CM Secs. 10.3-4). 
$$G(\mathbf{r}, t; \mathbf{r}_0, t_0) = \int \exp\left\{\frac{i}{\hbar}S[\mathbf{r}(\tau)]\right\} D[\mathbf{r}(\tau)], \tag{5.87}$$

3D propagator via the path integral

where definition (83a) of the path integral should be also modified correspondingly. (I will not go into
these technical details.) For the Young-type experiment (Fig. 3.1), where a classical particle could reach
the detector only after passing through one of the slits, the classical paths are the straight-line segments
shown in Fig. 3.1, and if they are much longer than the de Broglie wavelength, the propagator may be
well approximated by the sum of two contributions, each with the exponent given by the integral of
$i\mathbf{p}(\mathbf{r}) \cdot d\mathbf{r}/\hbar$ along the corresponding path - as it was done in Sec. 3.1.

Last but not least, the path integral allows simple solutions of some problems that would be hard 
to get by other methods. As the simplest example, let us consider the problem of tunneling in multi- 
dimensional space, sketched in Fig. 4 for the 2D case - just for graphics’ simplicity. Here, potential U(x, 
y) has the “saddle” shape. (Another helpful image is a mountain path between two summits, in Fig. 4 
located on the top and at the bottom of the drawing.) A particle of energy E may move classically in the 
left and right regions with U(x, y) < E, but can pass from one of these regions to another one only via the 
quantum-mechanical tunneling under the pass. Let us calculate the transparency of this tunnel barrier in 
the WKB approximation, ignoring the possible pre-exponential factor. 




Fig. 5.4. Saddle-type 2D potential 
profile and the instanton trajectory of 
a particle of energy E (dashed line, 
schematically). 


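According to the evident multi-dimensional generalization of Eq. (86), for the classically forbidden
region, where $E < U(x, y)$, the contributions to propagator (87) are proportional to

$$\exp\left\{-\int_{\mathbf{r}_0}^{\mathbf{r}} \boldsymbol{\kappa}(\mathbf{r}') \cdot d\mathbf{r}'\right\} \exp\left\{-\frac{i}{\hbar}E(t - t_0)\right\}, \tag{5.88}$$

where the magnitude of vector $\boldsymbol{\kappa}$ at each point may be calculated just as in the 1D case - see, e.g., Eq.
(2.97),

$$\frac{\hbar^2\kappa^2(\mathbf{r})}{2m} = U(\mathbf{r}) - E, \tag{5.89}$$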

while its direction should be tangential to the path trajectory in space. Now the path integral is actually 
much simpler than in the classically-allowed region, because the spatial exponents are purely real and 
there is no complex interference between them. Because of the minus sign in the exponent, the largest 
contribution to $G$ evidently comes from the trajectory (or rather a narrow bundle of trajectories) for
which the functional

$$\int_{\mathbf{r}_0}^{\mathbf{r}} \boldsymbol{\kappa}(\mathbf{r}') \cdot d\mathbf{r}' \tag{5.90}$$

has the smallest value, and the barrier transmission coefficient may be calculated as

$$T \approx |G|^2 \approx \exp\left\{-2\int_{\mathbf{r}_0}^{\mathbf{r}} \boldsymbol{\kappa}(\mathbf{r}') \cdot d\mathbf{r}'\right\}, \tag{5.91}$$

3D tunneling in WKB limit

where $\mathbf{r}$ and $\mathbf{r}_0$ are certain points on the opposite classical turning-point surfaces: $U(\mathbf{r}) = U(\mathbf{r}_0) = E$. 26

Thus the tunneling problem is reduced to finding the trajectory (including points $\mathbf{r}$ and $\mathbf{r}_0$) that
connects the two surfaces and minimizes functional (90). This is of course a well-known problem of the
calculus of variations, 27 but it is interesting that the path integral provides a simple alternative way of
solving it. Let us consider an auxiliary problem of the particle’s motion in a potential profile $U_{inv}(\mathbf{r})$ that is
inverted relative to the particle’s energy $E$, i.e. is defined by the following equality:

$$U_{inv}(\mathbf{r}) - E \equiv E - U(\mathbf{r}). \tag{5.92}$$

As was discussed above, at fixed energy $E$, the path integral for the WKB motion in the classically
allowed region of potential $U_{inv}(x, y)$ (that coincides with the classically forbidden region of the original
problem) is dominated by the classical trajectory corresponding to the minimum of

$$S_{inv} = \int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{p}_{inv}(\mathbf{r}') \cdot d\mathbf{r}' = \hbar\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{k}_{inv}(\mathbf{r}') \cdot d\mathbf{r}', \tag{5.93}$$

where $\mathbf{k}_{inv}$ should be determined from the relation

$$\frac{\hbar^2 k_{inv}^2(\mathbf{r})}{2m} = E - U_{inv}(\mathbf{r}). \tag{5.94}$$

But comparing Eqs. (89), (92), and (94), we see that $k_{inv} = \kappa$ at each point of space! This means that the
tunneling path (in the WKB limit) corresponds to the classical (so-called instanton) 28 trajectory of the
same particle in the inverted potential $U_{inv}(\mathbf{r})$. If the initial point $\mathbf{r}_0$ is fixed, this trajectory may be
readily found by the means of classical mechanics. (Note that the initial velocity of the instanton
launched from point $\mathbf{r}_0$ should be zero, because by the classical turning point definition, $U_{inv}(\mathbf{r}_0) = U(\mathbf{r}_0)
= E$.) Thus the problem is reduced to a simpler task of maximizing the transparency (91) over the
position of $\mathbf{r}_0$ on the equipotential surface $U(\mathbf{r}_0) = E$. Moreover, for many symmetric potentials, the
position of this point may be readily guessed without calculations.

26 One can argue that the pre-exponential coefficient in Eq. (91) should be close to 1, just like in Eq. (2.117), 
especially if the potential is smooth in the sense of Eq. (2.107), where x is the coordinate along the trajectory. 

27 For a concise introduction to the field see, e.g., I. Gelfand and S. Fomin, Calculus of Variations, Dover, 2000, 
or L. Elsgolc, Calculus of Variations, Dover, 2007. 

28 In quantum field theory, the instanton concept may be formulated somewhat differently, and has more complex 
applications - see, e.g. R. Rajaraman, Solitons and Instantons, North Holland, 1987. 
Note that besides the calculation of barrier transparency, the instanton trajectory has one more
important implication: the so-called traversal time $\tau_t$ of the classical motion along it, in the inverted
potential, defined by Eq. (94), plays the role of the most important (though not the only one) time scale
of the particle’s tunneling under the potential barrier. 29


5.4. Revisiting harmonic oscillator 

Let us return to the 1D harmonic oscillator, i.e. any system described by Hamiltonian (2.50) with
potential energy (2.111):

$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{m\omega_0^2\hat{x}^2}{2}. \tag{5.95}$$

Harmonic oscillator: Hamiltonian

In Sec. 2.10 we have used the “brute-force” (wave-mechanics) approach to analyze the eigenfunctions
$\psi_n(x)$ and eigenvalues $E_n$ of this Hamiltonian, and found that, unfortunately, that approach required
relatively complex math that obscures the physics of these stationary (“Fock”) states. Now let us use the
bra-ket formalism to make the properties of these states much more transparent, using very simple
calculations.

First, introducing normalized (dimensionless) operators of coordinates and momentum: 30

$$\hat{\xi} \equiv \frac{\hat{x}}{x_0}, \qquad \hat{\zeta} \equiv \frac{\hat{p}}{m\omega_0 x_0}, \tag{5.96}$$

where $x_0 \equiv (\hbar/m\omega_0)^{1/2}$ is the natural coordinate scale ($\sqrt{2}$ times the r.m.s. spread of the ground-state
wavefunction) which was discussed in detail in Sec. 2.10, we can present Hamiltonian (95) in a very simple
and $x \leftrightarrow p$ symmetric form:

$$\hat{H} = \frac{\hbar\omega_0}{2}\left(\hat{\xi}^2 + \hat{\zeta}^2\right). \tag{5.97}$$

Now, let us introduce a new operator

$$\hat{a} \equiv \frac{\hat{\xi} + i\hat{\zeta}}{\sqrt{2}} \equiv \left(\frac{m\omega_0}{2\hbar}\right)^{1/2}\left(\hat{x} + i\frac{\hat{p}}{m\omega_0}\right). \tag{5.98a}$$

Creation-annihilation operators

Since both operators $\hat{\xi}$ and $\hat{\zeta}$ correspond to real observables, i.e. have real eigenvalues and hence are
Hermitian (self-adjoint), the Hermitian conjugate of operator $\hat{a}$ is simply its complex conjugate:

$$\hat{a}^\dagger \equiv \frac{\hat{\xi} - i\hat{\zeta}}{\sqrt{2}} \equiv \left(\frac{m\omega_0}{2\hbar}\right)^{1/2}\left(\hat{x} - i\frac{\hat{p}}{m\omega_0}\right). \tag{5.98b}$$

Solving the system of two equations (98) for $\hat{\xi}$ and $\hat{\zeta}$, we may readily get reciprocal relations


29 See, e.g., M. Buttiker and R. Landauer, P/iys. Rev. Lett. 49 , 1739 (1982), and references therein. 

30 This normalization is not really necessary, it just makes the following calculations less bulky - and thus more 
aesthetically appealing. 
$$\hat{\xi} = \frac{1}{\sqrt{2}}\left(\hat{a} + \hat{a}^\dagger\right), \qquad \hat{\zeta} = \frac{1}{\sqrt{2}\,i}\left(\hat{a} - \hat{a}^\dagger\right). \tag{5.99}$$


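Our Hamiltonian (97) includes squares of these operators. Calculating them, we have to be careful to
avoid swapping the new operators, because they do not commute. Indeed, for the normalized operators
(96), Eq. (2.14) gives

$$\left[\hat{\xi}, \hat{\zeta}\right] = \frac{1}{x_0^2 m\omega_0}\left[\hat{x}, \hat{p}\right] = i\hat{I}, \tag{5.100}$$

so that Eqs. (98) yield

$$\left[\hat{a}, \hat{a}^\dagger\right] = \frac{1}{2}\left[\hat{\xi} + i\hat{\zeta},\; \hat{\xi} - i\hat{\zeta}\right] = -i\left[\hat{\xi}, \hat{\zeta}\right] = \hat{I}. \tag{5.101}$$

With such due caution, Eq. (99) gives

$$\hat{\xi}^2 = \frac{1}{2}\left(\hat{a}^2 + \hat{a}^{\dagger 2} + \hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a}\right), \qquad \hat{\zeta}^2 = -\frac{1}{2}\left(\hat{a}^2 + \hat{a}^{\dagger 2} - \hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a}\right). \tag{5.102}$$

Plugging these expressions back into Eq. (97), we get

$$\hat{H} = \frac{\hbar\omega_0}{2}\left(\hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a}\right). \tag{5.103}$$

This expression is elegant enough, but may be recast into an even more convenient form. For
that, let us rewrite the commutation relation (101) as

$$\hat{a}\hat{a}^\dagger = \hat{a}^\dagger\hat{a} + \hat{I} \tag{5.104}$$

and plug it into Eq. (103). The result is

$$\hat{H} = \frac{\hbar\omega_0}{2}\left(2\hat{a}^\dagger\hat{a} + \hat{I}\right) = \hbar\omega_0\left(\hat{N} + \frac{1}{2}\hat{I}\right), \tag{5.105}$$

where, in the last form, one more (evidently, Hermitian) operator,

$$\hat{N} \equiv \hat{a}^\dagger\hat{a}, \tag{5.106}$$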


has been introduced. Since, according to Eq. (105), operators $\hat{H}$ and $\hat{N}$ differ only by the addition of
an identity operator and the multiplication by a c-number, these operators commute. Hence, according to
the general arguments of Sec. 4.5, they share the set of stationary (Fock) eigenstates $n$, and we can write
the eigenproblem for the new operator as

$$\hat{N}|n\rangle = N_n|n\rangle, \tag{5.107}$$

where $N_n$ are some eigenvalues that, according to Eq. (105), determine also the energy spectrum of the
oscillator:

$$E_n = \hbar\omega_0\left(N_n + \frac{1}{2}\right). \tag{5.108}$$
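So far, we know only that all eigenvalues $N_n$ are real, but not much more. In order to calculate
them, let us carry out the following calculation - splendid in its simplicity and efficiency. Consider the
result of action of operator $\hat{N}$ on the ket-vector $\hat{a}^\dagger|n\rangle$. Using the definition (106) and the associative
rule, we may write

$$\hat{N}\left(\hat{a}^\dagger|n\rangle\right) \equiv \left(\hat{a}^\dagger\hat{a}\right)\left(\hat{a}^\dagger|n\rangle\right) = \hat{a}^\dagger\left(\hat{a}\hat{a}^\dagger\right)|n\rangle. \tag{5.109}$$

Now using the commutation relation (104), and then Eq. (107), we may continue as

$$\hat{a}^\dagger\left(\hat{a}\hat{a}^\dagger\right)|n\rangle = \hat{a}^\dagger\left(\hat{a}^\dagger\hat{a} + \hat{I}\right)|n\rangle = \hat{a}^\dagger\left(\hat{N} + \hat{I}\right)|n\rangle = \hat{a}^\dagger(N_n + 1)|n\rangle = (N_n + 1)\left(\hat{a}^\dagger|n\rangle\right). \tag{5.110}$$

Let us summarize the result of this calculation:

$$\hat{N}\left(\hat{a}^\dagger|n\rangle\right) = (N_n + 1)\left(\hat{a}^\dagger|n\rangle\right). \tag{5.111}$$

Performing an absolutely similar calculation with operator $\hat{a}$, we can also get another formula:

$$\hat{N}\left(\hat{a}|n\rangle\right) = (N_n - 1)\left(\hat{a}|n\rangle\right). \tag{5.112}$$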


It is time to stop calculations and translate these results into plain English: if $|n\rangle$ is an eigenket of
operator $\hat{N}$ with eigenvalue $N_n$, then $\hat{a}^\dagger|n\rangle$ and $\hat{a}|n\rangle$ are also eigenkets of that operator, with
eigenvalues $(N_n + 1)$ and $(N_n - 1)$, respectively. This statement may be presented with the ladder
diagram shown in Fig. 5.


Fig. 5.5. Hierarchy (the “ladder diagram”) of eigenstates of a 1D harmonic oscillator, listing each
eigenket next to its eigenvalue of $\hat{N}$. Arrows show the actions of the creation and annihilation
operators on the eigenstates.


Operator $\hat{a}^\dagger$ moves the system a step up the ladder, while operator $\hat{a}$ brings it one step down. In
other words, the former operator creates a new excitation of the system, 31 while the latter operator kills
(“annihilates”) such excitation. This is why $\hat{a}^\dagger$ is called the creation operator, and $\hat{a}$, the annihilation
operator. In its turn, according to Eq. (107), operator $\hat{N}$ does not change the state of the system, but
“counts” its position on the ladder:

$$\langle n|\hat{N}|n\rangle = \langle n|N_n|n\rangle = N_n. \tag{5.113}$$


31 For the electromagnetic field oscillators, such excitations are called photons; for the mechanical wave field
oscillators, phonons, etc.
This is why $\hat{N}$ is called the number operator, in our current context meaning the number of the
elementary excitations of the oscillator.

This calculation still needs a completion. Indeed, we still do not know whether the ladder shown
in Fig. 5 shows all eigenstates of the oscillator, and what exactly the numbers $N_n$ are. Fascinating
enough, both questions may be answered by exploring a single paradox. Let us start with some state
(step of the ladder), and keep going down it, applying operator $\hat{a}$ again and again. Each time,
eigenvalue $N_n$ is decreased by one, so that eventually it should become negative. However, this cannot
happen, because any real eigenstate, including the states presented by kets $|d\rangle \equiv \hat{a}|n\rangle$ and $|n\rangle$, should
have a positive norm - see Eq. (4.16). Comparing the norms,

$$\|n\|^2 = \langle n|n\rangle, \qquad \|d\|^2 = \langle n|\hat{a}^\dagger\hat{a}|n\rangle = \langle n|\hat{N}|n\rangle = N_n\langle n|n\rangle, \tag{5.114}$$

we see that both of them cannot be positive simultaneously if $N_n$ is negative.

The way toward the resolution of this paradox is to notice that the action of the creation and
annihilation operators on the stationary states may consist in not only their promotion to the next step of
the ladder diagram, but also their multiplication by some c-numbers:

$$\hat{a}|n\rangle = A_n|n - 1\rangle, \qquad \hat{a}^\dagger|n\rangle = A'_n|n + 1\rangle. \tag{5.115}$$

(Linear relations (111) and (112) clearly allow that.) Let us calculate coefficients $A_n$ assuming, for
convenience, that all eigenstates, including states $n$ and $(n - 1)$, are normalized:

$$\langle n|n\rangle = 1, \qquad \langle n - 1|n - 1\rangle = \langle n|\frac{\hat{a}^\dagger}{A_n^*}\frac{\hat{a}}{A_n}|n\rangle = \frac{1}{A_n^* A_n}\langle n|\hat{N}|n\rangle = \frac{N_n}{A_n^* A_n}\langle n|n\rangle = 1. \tag{5.116}$$

From here, we get $|A_n| = (N_n)^{1/2}$, i.e.

$$\hat{a}|n\rangle = N_n^{1/2} e^{i\varphi_n}|n - 1\rangle, \tag{5.117}$$

where $\varphi_n$ is an arbitrary real phase. Now let us consider what happens if all numbers $N_n$ are integers.
(Because of the definition of $N_n$, given by Eq. (107), it is convenient to call these integers $n$, i.e. by the
same letter as the corresponding eigenstate.) Then when we have come down to the state with $n = 0$, an
attempt to make one more step down gives

$$\hat{a}|0\rangle = 0|-1\rangle. \tag{5.118}$$

But in accordance with Eq. (4.9), the state in the right-hand part of this equation is the “null-state”, i.e.
does not exist. 32 This gives the (only known :-) resolution of the state ladder paradox: the ladder has the
lowest step with $N_n = n = 0$.

As a by-product of our discussion, we have obtained a very important relation $N_n = n$, which
means, in particular, that the state ladder includes all eigenstates of the oscillator. Plugging this relation
into Eq. (108), we see that the full spectrum of eigenenergies of the harmonic oscillator is described by
the simple formula


32 Please note again the radical difference between the null-state in the right-hand part of Eq. (118) and the state 
described by ket- vector |0) in the left-hand side of that relation. The latter state does exist and, moreover, presents 
the most important, ground state of the system, with n = 0 - see Eq. (2.269). 
$$E_n = \hbar\omega_0\left(n + \frac{1}{2}\right), \qquad n = 0, 1, 2, \ldots, \tag{5.119}$$


which was already discussed in Sec. 2.10. It is rather remarkable that the bra-ket formalism has allowed
us to derive it without calculation of the corresponding (rather cumbersome) wavefunctions $\psi_n(x)$ - see
Eqs. (2.279).

Moreover, the formalism may be also used to calculate virtually any bra-ket pertaining to the
oscillator, without using $\psi_n(x)$. In order to illustrate that, let us first calculate the coefficients $A'_n$
participating in the latter of relations (115). This can be done absolutely similarly to the above calculation
of $A_n$; alternatively, since we already know that $|A_n| = (N_n)^{1/2} = n^{1/2}$, we may notice that according to
Eqs. (106) and (115), the eigenproblem (107), which in our new notation for $N_n$ becomes

$$\hat{N}|n\rangle = n|n\rangle, \tag{5.120}$$

may be rewritten as

$$n|n\rangle = \hat{N}|n\rangle \equiv \hat{a}^\dagger\hat{a}|n\rangle = \hat{a}^\dagger A_n|n - 1\rangle = A_n A'_{n-1}|n\rangle. \tag{5.121}$$


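Comparing the first and the last form of this equality, we see that $|A'_{n-1}| = n/|A_n| = n^{1/2}$, i.e.
$A'_n = (n + 1)^{1/2}\exp(i\varphi'_n)$. Taking all phases $\varphi_n$ and $\varphi'_n$ equal to zero for simplicity, we may
reduce Eqs. (115) to their final, standard form 33

$$\hat{a}^\dagger|n\rangle = (n + 1)^{1/2}|n + 1\rangle, \qquad \hat{a}|n\rangle = n^{1/2}|n - 1\rangle. \tag{5.122}$$

Up and down the Fock state ladder

Now we can use these formulas to calculate, for example, the matrix elements of operator $\hat{x}$ in
the Fock state basis:

$$\langle n'|\hat{x}|n\rangle = x_0\langle n'|\hat{\xi}|n\rangle = \frac{x_0}{\sqrt{2}}\langle n'|\left(\hat{a} + \hat{a}^\dagger\right)|n\rangle \equiv \frac{x_0}{\sqrt{2}}\left[\langle n'|\hat{a}|n\rangle + \langle n'|\hat{a}^\dagger|n\rangle\right]. \tag{5.123}$$

To complete the calculation, we may now use Eqs. (122) and the Fock state orthonormality:

$$\langle n'|n\rangle = \delta_{n'n}. \tag{5.124}$$

The result is

$$\langle n'|\hat{x}|n\rangle = \left(\frac{\hbar}{2m\omega_0}\right)^{1/2}\left[n^{1/2}\delta_{n',n-1} + (n + 1)^{1/2}\delta_{n',n+1}\right]. \tag{5.125}$$

Coordinate’s matrix elements

Acting absolutely similarly, for the momentum bra-kets we get a similar expression:

$$\langle n'|\hat{p}|n\rangle = i\left(\frac{\hbar m\omega_0}{2}\right)^{1/2}\left[-n^{1/2}\delta_{n',n-1} + (n + 1)^{1/2}\delta_{n',n+1}\right]. \tag{5.126}$$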

Hence the matrices of both operators in the Fock-state basis have only two diagonals, adjacent to the
main diagonal; all other elements (including the diagonal ones) are zeros.
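
These matrix relations are easy to verify numerically. The sketch below builds the matrix of the annihilation operator from Eq. (5.122) in a Fock basis truncated to `DIM` states (the truncation and all helper names are assumptions of this illustration), and then checks the number operator, the commutator, and the two-diagonal structure of the coordinate matrix, with $x$ measured in units of $x_0$.

```python
import math

DIM = 8   # truncation of the (infinite) Fock basis, an assumption of this sketch

def annihilation():
    """Matrix <n'|a|n> = sqrt(n) * delta(n', n-1), from Eq. (5.122)."""
    return [[math.sqrt(n) if row == n - 1 else 0.0 for n in range(DIM)]
            for row in range(DIM)]

def transpose(m):
    return [[m[j][i] for j in range(DIM)] for i in range(DIM)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

a = annihilation()
a_dag = transpose(a)          # the matrix is real, so its Hermitian conjugate is the transpose
number = matmul(a_dag, a)     # N = a†a, Eq. (5.106): diagonal, with elements 0, 1, 2, ...
a_a_dag = matmul(a, a_dag)

# [a, a†]: the identity matrix, except for the last diagonal element,
# which equals -(DIM - 1) purely because of the basis truncation.
commutator = [[a_a_dag[i][j] - number[i][j] for j in range(DIM)]
              for i in range(DIM)]

# Coordinate in units of x0: xi = (a + a†)/sqrt(2), Eq. (5.99)
xi = [[(a[i][j] + a_dag[i][j]) / math.sqrt(2) for j in range(DIM)]
      for i in range(DIM)]
```

The only nonvanishing elements of `xi` are those with $n' = n \pm 1$, in agreement with Eq. (5.125).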


33 A useful mnemonic rule is that the c-number coefficient in any of these relations is equal to the square root of
the larger of the numbers of the two states it relates.
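Matrix elements of higher powers of these operators, as well as their products, may be handled
similarly, though the higher the power, the bulkier the result. For example, 34

$$\langle n'|\hat{x}^2|n\rangle = \langle n'|\hat{x}\hat{x}|n\rangle = \sum_{n''=0}^{\infty}\langle n'|\hat{x}|n''\rangle\langle n''|\hat{x}|n\rangle$$
$$= \frac{x_0^2}{2}\sum_{n''=0}^{\infty}\left[(n'')^{1/2}\delta_{n',n''-1} + (n'' + 1)^{1/2}\delta_{n',n''+1}\right]\left[n^{1/2}\delta_{n'',n-1} + (n + 1)^{1/2}\delta_{n'',n+1}\right]$$
$$= \frac{x_0^2}{2}\left\{[n(n - 1)]^{1/2}\delta_{n',n-2} + [(n + 1)(n + 2)]^{1/2}\delta_{n',n+2} + (2n + 1)\delta_{n',n}\right\}. \tag{5.127}$$

For applications, the most important of these matrix elements are those on its main diagonal:

$$\langle x^2\rangle = \langle n|\hat{x}^2|n\rangle = \frac{x_0^2}{2}(2n + 1). \tag{5.128}$$

This expression shows, in particular, that the expectation value of the oscillator’s potential energy in the
$n$-th Fock state is

$$\langle U\rangle = \frac{m\omega_0^2}{2}\langle x^2\rangle = \frac{\hbar\omega_0}{2}\left(n + \frac{1}{2}\right). \tag{5.129}$$

This is exactly 1/2 of the total energy (119) of the oscillator. As a sanity check, an absolutely similar
calculation of the kinetic energy shows that

$$\left\langle\frac{p^2}{2m}\right\rangle = \frac{1}{2m}\langle n|\hat{p}^2|n\rangle = \frac{\hbar\omega_0}{2}\left(n + \frac{1}{2}\right), \tag{5.130}$$

i.e. both partial energies equal $E_n/2$, just as in a classical oscillator. 35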
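
The sum rule in the first line of Eq. (5.127) and its closed form in the last line may be compared directly. In the sketch below (an illustration with hypothetical helper names), coordinates are measured in units of $x_0$ and energies in units of $\hbar\omega_0$, and the sum over intermediate states is cut off at `DIM`; the cutoff does not affect the elements with $n, n' \leq$ `DIM` - 2, because only the states $n'' = n \pm 1$ contribute to the sum.

```python
import math

DIM = 12  # cutoff of the sum over intermediate states (an assumption of this sketch)

def x_element(n_prime, n):
    """<n'|x|n> in units of x0, Eq. (5.125)."""
    if n_prime == n - 1:
        return math.sqrt(n / 2)
    if n_prime == n + 1:
        return math.sqrt((n + 1) / 2)
    return 0.0

def x2_element_sum_rule(n_prime, n):
    """First line of Eq. (5.127): sum over intermediate states n''."""
    return sum(x_element(n_prime, k) * x_element(k, n) for k in range(DIM))

def x2_element_closed_form(n_prime, n):
    """Last line of Eq. (5.127), in units of x0**2."""
    if n_prime == n - 2:
        return 0.5 * math.sqrt(n * (n - 1))
    if n_prime == n + 2:
        return 0.5 * math.sqrt((n + 1) * (n + 2))
    if n_prime == n:
        return 0.5 * (2 * n + 1)
    return 0.0

def potential_energy(n):
    """<U> in units of hbar*omega0, Eq. (5.129): one half of <x^2> / x0^2."""
    return 0.5 * x2_element_sum_rule(n, n)
```

For example, `potential_energy(3)` evaluates to 1.75, i.e. one half of $E_3 = 3.5\,\hbar\omega_0$.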


5.5. The Glauber and squeezed states 

There is evidently a huge difference between a quantum stationary (Fock) state of the oscillator
and its classical state. Indeed, let us write the classical Hamilton equations of motion of the oscillator
(using capital letters to distinguish the classical variables from arguments of quantum wavefunctions):

$$\dot{X} = \frac{P}{m}, \qquad \dot{P} = -\frac{\partial H}{\partial X} = -m\omega_0^2 X. \tag{5.131}$$

On the “phase plane” with Cartesian coordinates $x$ and $p$ (Fig. 6), these equations describe clockwise
rotation of the representation point $\{X(t), P(t)\}$ along an elliptic trajectory starting from the initial point
$\{X(0), P(0)\}$. (The normalization of momentum by $m\omega_0$, similar to the one performed by the second of
Eqs. (96), makes the trajectory pleasingly circular, with a constant radius equal to the oscillation’s
amplitude $A$, reflecting the constant full energy


34 The first line of Eq. (127), evidently valid for any time-independent system, is the simplest of the so-called sum 
rules, which will be repeatedly discussed below. 

35 Still note that operators of the partial (potential and kinetic) energies do not commute with either each other or 
with the full-energy (Hamiltonian) operator, so that the Fock states n are not their eigenstates. 
$$E = \frac{m\omega_0^2}{2}A^2, \qquad A^2 \equiv [X(t)]^2 + \left[\frac{P(t)}{m\omega_0}\right]^2 = \text{const} = [X(0)]^2 + \left[\frac{P(0)}{m\omega_0}\right]^2, \tag{5.132}$$

determined by the initial conditions.)

For the forthcoming comparison with quantum states, it is convenient to describe this classical
solution by the following dimensionless complex variable

$$\alpha(t) \equiv \frac{1}{\sqrt{2}\,x_0}\left[X(t) + i\frac{P(t)}{m\omega_0}\right], \tag{5.133}$$

which is essentially the standard complex-number representation of the system’s position on the 2D phase
plane, with $|\alpha| = A/\sqrt{2}\,x_0$. With this definition, Eqs. (131) are conveniently merged into one equation,

$$\dot{\alpha} = -i\omega_0\alpha, \tag{5.134}$$

with an evident, very simple solution

$$\alpha(t) = \alpha(0)\,e^{-i\omega_0 t}, \tag{5.135}$$

where the constant $\alpha(0)$ may be complex, and is just the (normalized) classical complex amplitude of
oscillations. 36 This equation describes sinusoidal oscillations of both $X(t) \propto \text{Re}[\alpha(t)]$ and $P(t) \propto \text{Im}[\alpha(t)]$,
with a phase shift of $\pi/2$ between them.
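
The equivalence of the single complex equation (5.134) and the two real equations (5.131) may be confirmed by a short sketch. The units, the oscillator parameters, and the initial conditions below are assumptions made only for illustration.

```python
import cmath
import math

M, OMEGA0, HBAR = 1.0, 2.0, 1.0         # assumed units and parameters
X0 = math.sqrt(HBAR / (M * OMEGA0))     # coordinate scale x0
X_INIT, P_INIT = 1.5, -0.7              # hypothetical initial conditions

def alpha(x, p):
    """Dimensionless complex amplitude of Eq. (5.133)."""
    return (x + 1j * p / (M * OMEGA0)) / (math.sqrt(2) * X0)

def classical_solution(t):
    """X(t) and P(t) solving Eqs. (5.131) for the given initial conditions."""
    c, s = math.cos(OMEGA0 * t), math.sin(OMEGA0 * t)
    x = X_INIT * c + P_INIT / (M * OMEGA0) * s
    p = P_INIT * c - M * OMEGA0 * X_INIT * s
    return x, p

def alpha_rotated(t):
    """Eq. (5.135): the initial amplitude rotated clockwise by the angle omega0*t."""
    return alpha(X_INIT, P_INIT) * cmath.exp(-1j * OMEGA0 * t)

# The two descriptions agree at any time, and |alpha| stays constant:
difference = abs(alpha(*classical_solution(1.7)) - alpha_rotated(1.7))
```

Here `difference` vanishes to within rounding errors, while `abs(alpha_rotated(t))` does not depend on `t`, reflecting the circular trajectory on the normalized phase plane.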



Fig. 5.6. Schematic representation of various states of a 
harmonic oscillator on the phase plane. The bold black 
point represents a classical state, with the dashed line 
showing its trajectory. (Very imperfect) classical images 
of the Fock states with n = 0, 1 , and 2 are shown in blue, 
while the blurred red spot is the (equally schematic) 
Glauber state’s image. Finally, the magenta elliptical 
spot is a classical image of a squeezed ground state. 
Arrows show the direction of states’ evolution in time. 


On the other hand, according to the basic Eqs. (4.157)-(4.158), the time dependence of a Fock
state, as of a stationary state of the oscillator, is limited to the phase factor $\exp\{-iE_n t/\hbar\}$, which appears
not in observables but only in the wavefunction, and as a result gives time-independent expectation values
of $x$, $p$, or of any function thereof. (Moreover, as Eqs. (125) and (126) show, $\langle x\rangle = \langle p\rangle = 0$.) Taking into


36 See, e.g., CM Chapter 4, especially Eqs. (4.4) and Fig. 4.9 and its discussion. 
account Eqs. (129) and (130), the closest (though very imperfect) geometric image 37 for such a state on
the phase plane is a blurred circle of radius $A_n = x_0(2n + 1)^{1/2}$, along which the wavefunction is
uniformly spread as a wave - see the blue rings in Fig. 6. For the ground state ($n = 0$), with
wavefunction (2.269), a better image is a blurred round spot, of radius $\sim x_0$, at the origin.

However, the Fock states $n$ are not the only possible quantum states of the oscillator: according
to the basic Eq. (4.6), a state described by ket-vector

$$|\alpha\rangle = \sum_{n=0}^{\infty}\alpha_n|n\rangle \tag{5.136}$$

with any set of (complex) c-numbers $\alpha_n$, is also its legitimate state, subject only to the normalization
condition $\langle\alpha|\alpha\rangle = 1$, giving

$$\sum_{n=0}^{\infty}|\alpha_n|^2 = 1. \tag{5.137}$$

It is natural to ask: can we select coefficients $\alpha_n$ in such a special way that the state properties
would be closer to the classical ones; in particular the expectation values $\langle x\rangle$ and $\langle p\rangle$ of coordinate and
momentum would evolve in time just as the classical values $X(t)$ and $P(t)$, while the uncertainties of
these observables would be time-independent and the same as in the ground state:

$$\delta x = \frac{x_0}{\sqrt{2}} = \left(\frac{\hbar}{2m\omega_0}\right)^{1/2}, \qquad \delta p = \frac{m\omega_0 x_0}{\sqrt{2}} = \left(\frac{\hbar m\omega_0}{2}\right)^{1/2}, \tag{5.138}$$

with the smallest possible value of the uncertainty product, $\delta x\,\delta p = \hbar/2$? 38 Let me show that such a
Glauber state, 39 which is schematically represented in Fig. 6 by a blurred red spot around the classical
point $\{X(t), P(t)\}$, is indeed possible.

Conceptually the simplest way to find the corresponding coefficients $\alpha_n$ would be to calculate
$\langle x\rangle$, $\langle p\rangle$, $\delta x$ and $\delta p$ for an arbitrary set of $\alpha_n$, and then try to optimize these coefficients to reach our goal.
However, this problem may be solved much more easily using wave mechanics. Indeed, let us consider the
following wavefunction

$$\Psi_\alpha(x, t) = C\exp\left\{-\frac{m\omega_0}{2\hbar}[x - X(t)]^2 + i\frac{P(t)x}{\hbar}\right\}. \tag{5.139}$$

Glauber state in coordinate representation

Its comparison with Eqs. (2.16) and (2.269) shows that this is just a Gaussian wave packet with the
average momentum $P$ and the coordinate width $\delta x$ given by Eq. (138), but shifted along axis $x$ by $X$.

37 I have to confess that such geometric mapping of a quantum state on the phase plane $[x, p]$ is not exactly
defined; you may think about colored areas in Fig. 6 as regions of pairs $\{x, p\}$ most probably obtained in
measurements. A quantitative definition of such a mapping will be given in Sec. 7.3 using the Wigner function, 
though, as we will see, even such imaging definition has certain internal contradictions. Still such cartoons may 
have considerable cognitive/heuristic value, if their limitations are kept in mind. 

38 In the quantum theory of measurements, Eqs. (138) are frequently referred to as the standard quantum limit. 

39 Named after R. J. Glauber who studied these states in detail in 1965, though they had been discussed in brief by 
E. Schrodinger as early as in 1926. Another popular name, “coherent”, for the Glauber states is very misleading, 
because all the quantum states we have studied so far (including the Fock states) may be presented as coherent 
(pure) superpositions of the basis states. 




Essential Graduate Physics 


QM: Quantum Mechanics 


Hence, this wavefunction satisfies all the above requirements, and a straightforward (though a bit bulky)
differentiation over $x$ and $t$ shows it also satisfies the oscillator’s Schrodinger equation, provided that
functions $X(t)$ and $P(t)$ satisfy classical Eqs. (131).
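
The claimed averages and uncertainties of the Gaussian packet (5.139) may be confirmed by direct numerical integration at a fixed moment of time. The sketch below assumes units with $\hbar = m = \omega_0 = 1$ (so that $x_0 = 1$), takes hypothetical instantaneous values of $X$ and $P$, and approximates the derivative by a central difference; none of these choices comes from the text.

```python
import cmath
import math

# Units with hbar = m = omega0 = 1 (an assumption of this sketch), so x0 = 1.
X_SHIFT, P_SHIFT = 1.2, 2.0      # hypothetical instantaneous values of X(t), P(t)
STEP = 0.005
GRID = [X_SHIFT - 10.0 + i * STEP for i in range(4001)]

def psi(x):
    """Eq. (5.139) at a fixed time, with the normalizing constant C = pi**(-1/4)."""
    return math.pi ** -0.25 * cmath.exp(-0.5 * (x - X_SHIFT) ** 2 + 1j * P_SHIFT * x)

def dpsi(x):
    """Central-difference approximation of d(psi)/dx."""
    return (psi(x + STEP) - psi(x - STEP)) / (2 * STEP)

def moments():
    """Return <x>, <p>, delta x, delta p by direct summation on the grid."""
    norm = sum(abs(psi(x)) ** 2 for x in GRID) * STEP
    mean_x = sum(x * abs(psi(x)) ** 2 for x in GRID) * STEP / norm
    mean_x2 = sum(x * x * abs(psi(x)) ** 2 for x in GRID) * STEP / norm
    # <p> = integral of psi* (-i d/dx) psi;  <p^2> = integral of |d(psi)/dx|^2
    mean_p = sum((psi(x).conjugate() * (-1j) * dpsi(x)).real for x in GRID) * STEP / norm
    mean_p2 = sum(abs(dpsi(x)) ** 2 for x in GRID) * STEP / norm
    return (mean_x, mean_p,
            math.sqrt(mean_x2 - mean_x ** 2), math.sqrt(mean_p2 - mean_p ** 2))

mean_x, mean_p, dx, dp = moments()
```

Within the discretization error, the averages coincide with $X$ and $P$, while `dx` and `dp` are both close to $1/\sqrt{2}$, giving the minimal uncertainty product of Eq. (5.138).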

This fact is true even for a more general situation when the oscillator, initially in its ground
state, 40 comes under the effect of a classical force $F(t)$. (Evidently, for its description it is sufficient to add
this function to the right-hand part of the second of Eqs. (131).) Moreover, the electromagnetic radiation
formed in “good” (single-mode) lasers is also in the Glauber state. (As will be discussed in Chapter 9,
the experimental formation of Fock states $n$, with the only exception of $n = 0$, i.e. the ground state, is
much harder.) This is why the Glauber states are so important.

Though Eq. (139) gives the full wave-mechanics description of a Glauber state, there is a
substantial place for the bra-ket formalism here too. For example, in order to calculate the corresponding
coefficients in expansion (136),

$$\alpha_n = \langle n|\alpha\rangle = \int dx\,\langle n|x\rangle\langle x|\alpha\rangle = \int\psi_n^*(x)\Psi_\alpha(x)\,dx, \tag{5.140}$$

we would need to use not only the simple Eq. (139), but also the Fock state wavefunctions $\psi_n(x)$, which
are not very appealing - see Eq. (2.279). Instead, this calculation may be readily done in the bra-ket
formalism, giving us one important byproduct result.

Let us start from expressing the double shift of the ground state (by X and P), that has led us to 
Eq. (139), in the operator language. Forgetting about the P for a minute, let us find a translation 

operator T x that produces the desirable shift of coordinate by X of an arbitrary wavefunction yAx) - say 
represented as the standard wave packet (59). Evidently, the result of its action, in the coordinate 
representation, is 

— | dp. (5.141) 


T x y/(x) = y/{x -X) = 


Z)= (77r^ (p)exp - 


;P(X- 


Hence, the shift may be achieved by the multiplication of each Fourier component of the packet, with 
momentum p, by exp { -ipX/ti } . This gives us a hint that the general fonn of the translation operator, valid 
in any representation, should be 



(5.142) 


X-translation 

operator 


The proof of this formula is provided by the fact that any operator is uniquely determined by the set of 
its matrix elements in any full and orthogonal basis, in particular the basis of momentum states p. 
According to Eq. (141), the analog of Eq. (23) for the p-representation, applied to the translation 
operator (which is evidently local), is 

$$\langle p|\hat{\mathcal{T}}_X|p'\rangle\,\varphi(p') = \delta(p - p')\exp\left\{-i\frac{pX}{\hbar}\right\}\varphi(p), \tag{5.143}$$

so that operator (142) does exactly the job we need it to. 
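The statement that a shift is equivalent to a phase factor on each Fourier component may be checked numerically. The following minimal Python sketch uses the discrete analog of Eq. (141) on a periodic grid; the grid size, the sample Gaussian packet, and the helper names `dft`, `idft`, and `translate` are assumptions made for this illustration only.

```python
import cmath
import math

def dft(samples):
    """Plain O(N^2) discrete Fourier transform (standard library only)."""
    n = len(samples)
    return [sum(samples[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def translate(samples, shift):
    """Shift a periodic sequence by `shift` grid steps by multiplying its k-th
    Fourier component by exp(-2*pi*i*k*shift/N), the discrete counterpart
    of the factor exp{-ipX/hbar}."""
    n = len(samples)
    spectrum = dft(samples)
    phased = [spectrum[k] * cmath.exp(-2j * math.pi * k * shift / n) for k in range(n)]
    return idft(phased)

# Hypothetical test packet: a Gaussian centered at grid point 10.
n_points = 32
packet = [math.exp(-0.5 * ((j - 10) / 2.0) ** 2) for j in range(n_points)]
moved = translate(packet, 5)

# Compare with the directly shifted packet, psi(x - X).
max_error = max(abs(moved[j] - packet[(j - 5) % n_points]) for j in range(n_points))
print(max_error)
```

The printed deviation should be at the level of floating-point rounding, i.e. the phase multiplication indeed reproduces the shifted packet (for integer shifts on a periodic grid).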


40 As will be discussed in Chapter 7, the ground state may be readily formed, for example, by providing a weak 
coupling of the oscillator to a low-temperature ($k_BT \ll \hbar\omega_0$) environment. 


The operator that provides the shift of momentum by P is absolutely similar - with the opposite 
sign under the exponent, due to the opposite sign of the exponent in the reciprocal Fourier transform, so 
that the simultaneous shift by both X and P may be achieved by the following translation operator: 

$$\hat{\mathcal{T}}_\alpha = \exp\left\{i\frac{P\hat{x} - \hat{p}X}{\hbar}\right\}. \tag{5.144}$$

[α-translation operator]


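As we already know, for a harmonic oscillator the creation-annihilation operators are more natural, so 
that we may use Eqs. (96) and (99) to recast Eq. (144) as 

$$\hat{\mathcal{T}}_\alpha = \exp\left\{\alpha\hat{a}^\dagger - \alpha^*\hat{a}\right\}, \quad\text{with}\quad \hat{\mathcal{T}}_\alpha^\dagger = \exp\left\{\alpha^*\hat{a} - \alpha\hat{a}^\dagger\right\}, \tag{5.145}$$

where the c-number α (generally, a function of time) is defined by Eq. (133). Now, according to Eq. 
(139), we may form the Glauber state's ket-vector just as 

$$|\alpha\rangle = \hat{\mathcal{T}}_\alpha|0\rangle. \tag{5.146}$$

[Glauber state as ground state's translation]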


This formula looks nice and simple, but making practical calculations (say, calculating 
expectation values of variables) with the translation operator (144) is not too easy because of its 
exponent-of-operators form. Fortunately, it turns out that a much simpler representation for the Glauber 
state is possible. To show that, let us start with the following general (and very useful) property of 
exponential functions of operators: if 

$$[\hat{A}, \hat{B}] = \mu\hat{I} \tag{5.147}$$

(where $\hat{A}$ and $\hat{B}$ are arbitrary operators, and μ is a c-number), then 41 

$$\exp\{+\hat{A}\}\,\hat{B}\,\exp\{-\hat{A}\} = \hat{B} + \mu\hat{I}. \tag{5.148}$$

Let us apply Eqs. (147)-(148) to two cases, both with 

$$\hat{A} \equiv \alpha^*\hat{a} - \alpha\hat{a}^\dagger, \quad\text{so that}\quad \exp\{+\hat{A}\} = \hat{\mathcal{T}}_\alpha^\dagger, \quad \exp\{-\hat{A}\} = \hat{\mathcal{T}}_\alpha. \tag{5.149}$$


First, let us take $\hat{B} = \hat{I}$; then Eq. (147) is valid with μ = 0, and Eq. (148) yields 

$$\hat{\mathcal{T}}_\alpha^\dagger\hat{\mathcal{T}}_\alpha = \hat{I}. \tag{5.150}$$

This equality means that the translation operator is unitary - not a big surprise, because if we shift a 
classical point on the complex phase plane by (+α) and then by (-α), we certainly must come back to the 
initial position. Relation (150) means merely that this fact is true for any quantum state as well. 

Second, let us take $\hat{B} = \hat{a}$; in order to verify Eq. (147) and find the corresponding μ, let us 
calculate the commutator. Using, at the due stage of calculation, Eq. (104), we get 

$$[\hat{A}, \hat{B}] = \left[\alpha^*\hat{a} - \alpha\hat{a}^\dagger,\ \hat{a}\right] = -\alpha\left[\hat{a}^\dagger, \hat{a}\right] = \alpha\hat{I}, \tag{5.151}$$


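41 The proof of Eq. (148) may be readily achieved by expanding operator $\hat{f}(\lambda) \equiv \exp\{+\lambda\hat{A}\}\hat{B}\exp\{-\lambda\hat{A}\}$ in the 
Taylor series in the c-number parameter λ, and then evaluating the result at λ = 1. 

so that in this case μ = α, and Eq. (148) yields 

$$\hat{\mathcal{T}}_\alpha^\dagger\,\hat{a}\,\hat{\mathcal{T}}_\alpha = \hat{a} + \alpha\hat{I}. \tag{5.152}$$

We have approached the summit of this beautiful calculation. Let us consider operator 

$$\hat{\mathcal{T}}_\alpha\hat{\mathcal{T}}_\alpha^\dagger\,\hat{a}\,\hat{\mathcal{T}}_\alpha. \tag{5.153}$$

Using Eq. (150), we may reduce this expression to $\hat{a}\hat{\mathcal{T}}_\alpha$, while the application of Eq. (152) to the same 
expression yields $\hat{\mathcal{T}}_\alpha\hat{a} + \alpha\hat{\mathcal{T}}_\alpha$. Hence, we get the following operator equality 

$$\hat{a}\hat{\mathcal{T}}_\alpha = \hat{\mathcal{T}}_\alpha\hat{a} + \alpha\hat{\mathcal{T}}_\alpha, \tag{5.154}$$

which may be applied to any state. Now acting by these two operators on the ground state $|0\rangle$ and using 
the facts that $\hat{a}|0\rangle$ is the null-state, while $\hat{\mathcal{T}}_\alpha|0\rangle = |\alpha\rangle$, we finally get a very simple and elegant result: 42 

$$\hat{a}|\alpha\rangle = \alpha|\alpha\rangle. \tag{5.155}$$

[Glauber state as operator â's eigenstate]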


Thus any Glauber state is just one of the eigenstates of the annihilation operator, namely the one 
with the eigenvalue equal to parameter α, i.e. to the complex representation (133) of the classical state 
which is the center of the Glauber state's distribution. 43 This fact makes the calculations of the Glauber 
state properties much simpler. As the simplest example, let us use Eq. (155) to find $\langle x\rangle$ in the Glauber 
state: 

$$\langle x\rangle = \langle\alpha|\hat{x}|\alpha\rangle = \frac{x_0}{\sqrt{2}}\langle\alpha|\left(\hat{a} + \hat{a}^\dagger\right)|\alpha\rangle \equiv \frac{x_0}{\sqrt{2}}\left(\langle\alpha|\hat{a}|\alpha\rangle + \langle\alpha|\hat{a}^\dagger|\alpha\rangle\right). \tag{5.156}$$


In the first term in the parentheses, we can apply Eq. (155) directly, while in the second term, we can 
use the bra-counterpart of that relation, $\langle\alpha|\hat{a}^\dagger = \langle\alpha|\alpha^*$. Now assuming that the Glauber state is 
normalized, $\langle\alpha|\alpha\rangle = 1$, and using Eq. (133), we get 

$$\langle x\rangle = \frac{x_0}{\sqrt{2}}\left(\langle\alpha|\alpha|\alpha\rangle + \langle\alpha|\alpha^*|\alpha\rangle\right) = \frac{x_0}{\sqrt{2}}\left(\alpha + \alpha^*\right) = X. \tag{5.157}$$

Acting absolutely similarly, we may readily extend this sanity check to verify that $\langle p\rangle = P$, and that δx 
and δp indeed obey Eq. (138). 

As a more thorough sanity check, let us use Eq. (155) to re-calculate the Glauber state's 
wavefunction (139). Inner-multiplying both sides of that relation by bra-vector $\langle x|$, and using definition 
(98a) of the annihilation operator, we get 

$$\langle x|\frac{1}{\sqrt{2}x_0}\left(\hat{x} + i\frac{\hat{p}}{m\omega_0}\right)|\alpha\rangle = \alpha\langle x|\alpha\rangle. \tag{5.158}$$

42 It is also rather counter-intuitive. Indeed, according to Eq. (122), the annihilation operator $\hat{a}$, acting on a Fock 
state n, “beats it down” to the lower-energy state (n - 1) - see Eq. (119). However, according to Eq. (155), its 
action on a Glauber state α does not lead to the state change and hence to an energy decrease! The resolution of 
this paradox may be achieved via representation of the Glauber state as a series of Fock states - see Eq. (165) 
below. Operator $\hat{a}$ indeed transfers each Fock component to a lower-energy state, but it also re-weights each term 
of the expansion, so that the complete energy of the Glauber state remains constant. 

43 Note that the spectrum of eigenvalues α of eigenproblem (155) is continuous - it may be any complex number! 


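Since $\langle x|$ is the bra-vector of the eigenstate of the Hermitian operator $\hat{x}$, they may be swapped, with the 
operator giving its eigenvalue x; acting on that bra-vector by the (local!) operator of momentum, we 
have to use it in the coordinate representation (63). As a result, we get 

$$\frac{1}{\sqrt{2}x_0}\left(x\langle x|\alpha\rangle + \frac{\hbar}{m\omega_0}\frac{\partial}{\partial x}\langle x|\alpha\rangle\right) = \alpha\langle x|\alpha\rangle. \tag{5.159}$$

But $\langle x|\alpha\rangle$ is nothing else than the Glauber state's wavefunction $\Psi_\alpha$, so that Eq. (159) gives for it a first-
order differential equation 

$$\frac{1}{\sqrt{2}x_0}\left(x\Psi_\alpha + \frac{\hbar}{m\omega_0}\frac{\partial\Psi_\alpha}{\partial x}\right) = \alpha\Psi_\alpha. \tag{5.160}$$

Chasing $\Psi_\alpha$ and x to the opposite sides of the equation, and using definition (133) of parameter α, we 
may bring this equation to the form 

$$\frac{1}{\Psi_\alpha}\frac{\partial\Psi_\alpha}{\partial x} = \frac{m\omega_0}{\hbar}\left[-x + \left(X + i\frac{P}{m\omega_0}\right)\right]. \tag{5.161}$$

Integrating both parts, we return to Eq. (139) that had been derived by wave-mechanics means. 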

Now we can use Eq. (155) for finding coefficients $\alpha_n$ in the expansion (136) of the Glauber 
state α in series over the Fock states n. Plugging Eq. (136) into each side of Eq. (155), using the first of 
Eqs. (122) in the left-hand part, and requiring the coefficients at each ket-vector $|n\rangle$ in both parts to be 
equal, we get the following recurrence relation for the coefficients: 

$$\alpha_{n+1} = \frac{\alpha}{(n+1)^{1/2}}\,\alpha_n. \tag{5.162}$$

Assuming some value of $\alpha_0$, and applying the relation sequentially for n = 0, 1, 2, etc., we get 

$$\alpha_n = \frac{\alpha^n}{(n!)^{1/2}}\,\alpha_0. \tag{5.163}$$

Now we can find $\alpha_0$ from the normalization requirement (137), getting 

$$|\alpha_0|^2\sum_{n=0}^{\infty}\frac{|\alpha|^{2n}}{n!} = 1. \tag{5.164}$$

In this sum, we may readily recognize the Taylor expansion of function $\exp\{|\alpha|^2\}$, so that the final 
result (besides an arbitrary common phase multiplier) is 

$$|\alpha\rangle = \exp\left\{-\frac{|\alpha|^2}{2}\right\}\sum_{n=0}^{\infty}\frac{\alpha^n}{(n!)^{1/2}}\,|n\rangle. \tag{5.165}$$

[Glauber state as Fock states' superposition]

It means, in particular, that the probability $W_n = \alpha_n\alpha_n^*$ of finding the system on the n-th 
energy level (119) obeys the well-known Poisson distribution (Fig. 7): 

$$W_n = \frac{\langle n\rangle^n}{n!}\,e^{-\langle n\rangle}, \tag{5.166}$$

[Poisson distribution]

where $\langle n\rangle$ is the average value of n: 

$$\langle n\rangle = |\alpha|^2. \tag{5.167}$$

For applications, perhaps the most important mathematical property of this distribution is 

$$\delta n = \langle n\rangle^{1/2}; \tag{5.168}$$

[r.m.s. fluctuation]

note also that at $\langle n\rangle \gg 1$, and hence $\delta n \ll \langle n\rangle$, the Poisson distribution approaches the Gaussian 
(“normal”) one. 
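The chain from the recurrence relation (162) to the Poisson statistics (166)-(168) is easy to verify numerically. The sketch below is only an illustration: the value of α, the truncation level `n_max`, and the function name `glauber_coefficients` are arbitrary assumptions, and the common phase of $\alpha_0$ is taken equal to zero.

```python
import cmath
import math

def glauber_coefficients(alpha, n_max):
    """Fock-state coefficients alpha_n of a Glauber state, built from the
    recurrence (162), starting from alpha_0 = exp(-|alpha|^2 / 2)."""
    coeffs = [cmath.exp(-abs(alpha) ** 2 / 2)]
    for n in range(n_max):
        coeffs.append(alpha / math.sqrt(n + 1) * coeffs[n])
    return coeffs

alpha = 1.5 + 0.5j                      # illustrative value, |alpha|^2 = 2.5
coeffs = glauber_coefficients(alpha, 60)

# Probabilities W_n = |alpha_n|^2 and their first two moments.
probs = [abs(c) ** 2 for c in coeffs]
norm = sum(probs)
mean_n = sum(n * w for n, w in enumerate(probs))
var_n = sum((n - mean_n) ** 2 * w for n, w in enumerate(probs))

# Eigenstate check: the n-th Fock component of a|alpha> is
# sqrt(n+1) * alpha_{n+1}; it should equal alpha * alpha_n.
eigen_error = max(abs(math.sqrt(n + 1) * coeffs[n + 1] - alpha * coeffs[n])
                  for n in range(len(coeffs) - 1))

print(norm, mean_n, var_n, eigen_error)
```

With the truncation far above $|\alpha|^2$, the norm should be close to 1, and both the mean and the variance of n close to $|\alpha|^2 = 2.5$, i.e. $\delta n = \langle n\rangle^{1/2}$.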


Now let us discuss the evolution of the Glauber state in time. In the Schrodinger language, it is 
completely described by dynamics (131) of the c-number shifts X(t) and P(t) participating in 
wavefunction (139). Note again that, in contrast to the spread of the wave packet of a free particle, 
discussed in Sec. 2.2, in the harmonic oscillator the Gaussian packet of special width (138) does not 
spread at all! 

An alternative and equivalent way of dynamics description is to use the Heisenberg equations of 
motion. As Eqs. (42) and (48) tell us, such equations for the Heisenberg operators of coordinate and 
momentum have to be similar to the classical equations (131): 

$$\dot{\hat{x}}_H = \frac{\hat{p}_H}{m}, \qquad \dot{\hat{p}}_H = -m\omega_0^2\,\hat{x}_H. \tag{5.169}$$


Now using Eqs. (98), for the Heisenberg-picture creation and annihilation operators we get equations 

$$\dot{\hat{a}}_H = -i\omega_0\hat{a}_H, \qquad \dot{\hat{a}}_H^\dagger = +i\omega_0\hat{a}_H^\dagger, \tag{5.170}$$

that are completely similar to the classical equation (134) for the c-number parameter α and its 
complex conjugate, and hence have the solutions identical to Eq. (135): 

$$\hat{a}_H(t) = \hat{a}_H(0)\,e^{-i\omega_0t}, \qquad \hat{a}_H^\dagger(t) = \hat{a}_H^\dagger(0)\,e^{+i\omega_0t}. \tag{5.171}$$


As was discussed in Sec. 4.6, such equations are very convenient because they enable simple 
calculation of time evolution of observables for any initial state of the oscillator (Fock, Glauber, or any 
other) using Eq. (4.191). Applied to a Glauber state α(0), such calculation gives the same results as have 
already been derived earlier in this section, and in particular confirms that the Gaussian wave packet of the 
special width (138) does not spread in time. 
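For a Glauber state, the time dependence (171) simply rotates the c-number parameter α on the complex plane. The short Python sketch below illustrates this; it assumes the definition of α in the form $\alpha = (X + iP/m\omega_0)/\sqrt{2}x_0$ used in Eq. (161), units with $x_0 = m = \omega_0 = 1$ by default, and hypothetical function names.

```python
import cmath
import math

def alpha_at(alpha0, omega0, t):
    """c-number counterpart of Eq. (171): alpha(t) = alpha(0) * exp(-i*omega0*t)."""
    return alpha0 * cmath.exp(-1j * omega0 * t)

def mean_x_and_p(alpha, x0=1.0, m=1.0, omega0=1.0):
    """Invert alpha = (X + i*P/(m*omega0)) / (sqrt(2)*x0) for the packet center."""
    big_x = math.sqrt(2) * x0 * alpha.real
    big_p = math.sqrt(2) * x0 * m * omega0 * alpha.imag
    return big_x, big_p

alpha0 = 1.0 + 0.0j                       # start at rest, shifted along x
quarter_period = alpha_at(alpha0, 1.0, math.pi / 2)
x_quarter, p_quarter = mean_x_and_p(quarter_period)
print(x_quarter, p_quarter, abs(quarter_period))
```

After a quarter of the oscillation period the packet center passes the origin with the negative momentum, just as a classical oscillator would, while $|\alpha|$ (and hence $\langle n\rangle = |\alpha|^2$) stays constant.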


Now let us consider what happens if the initial wave packet is still Gaussian, but has a different 
width, say $\delta x < x_0/\sqrt{2}$. As we already know from Sec. 2.2, the momentum spread δp will be 
correspondingly larger, still with the smallest uncertainty product: $\delta x\,\delta p = \hbar/2$. Such a squeezed ground 
state $|\zeta\rangle$, with zero expectation values of x and p, may be generated from the Fock/Glauber ground state: 

$$|\zeta\rangle = \hat{\mathcal{S}}_\zeta|0\rangle, \tag{5.172a}$$

[Squeezed ground state]

using the so-called squeezing operator, 

$$\hat{\mathcal{S}}_\zeta \equiv \exp\left\{\frac{1}{2}\left(\zeta^*\hat{a}\hat{a} - \zeta\hat{a}^\dagger\hat{a}^\dagger\right)\right\}, \tag{5.172b}$$

[Squeezing operator]


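which depends on a complex c-number parameter $\zeta = re^{i\theta}$. The parameter's modulus r determines the 
squeezing degree; it is straightforward to use Eq. (172) for checking that if ζ is real (θ = 0, ζ = r), then 

$$\delta x = \frac{x_0}{\sqrt{2}}\,e^{-r} \equiv \left(\frac{\hbar}{2m\omega_0}\right)^{1/2}e^{-r}, \qquad \delta p = \frac{m\omega_0x_0}{\sqrt{2}}\,e^{+r} \equiv \left(\frac{\hbar m\omega_0}{2}\right)^{1/2}e^{+r}, \qquad\text{so that}\quad \delta x\,\delta p = \frac{\hbar}{2}. \tag{5.173}$$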
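Equation (173) is simple enough to be tabulated directly. The following Python sketch, with the hypothetical function name `squeezed_uncertainties` and the default units $\hbar = m = \omega_0 = 1$ chosen for illustration, evaluates δx and δp for a real squeezing parameter r and shows that their product does not depend on r.

```python
import math

def squeezed_uncertainties(r, hbar=1.0, m=1.0, omega0=1.0):
    """Return (dx, dp) of the squeezed ground state for real zeta = r, per Eq. (173)."""
    x0 = math.sqrt(hbar / (m * omega0))          # oscillator length scale
    dx = x0 / math.sqrt(2) * math.exp(-r)
    dp = m * omega0 * x0 / math.sqrt(2) * math.exp(+r)
    return dx, dp

for r in (-1.0, 0.0, 0.7, 2.0):
    dx, dp = squeezed_uncertainties(r)
    print(r, dx, dp, dx * dp)
```

Positive r squeezes the coordinate uncertainty at the cost of the momentum one, negative r does the opposite, and r = 0 returns the usual ground state values.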


On the phase plane (Fig. 6), this state, with r > 0, may be represented by an oval spot squeezed 
along axis x (hence the state's name) and stretched along axis p; the same formulas but with r < 0 
describe the opposite squeezing. On the other hand, phase θ of the squeezing parameter ζ determines the 
angle θ/2 of the oval's turn about the phase plane origin - see the magenta ellipse in Fig. 6; if θ ≠ 0, Eqs. 
(173) are valid for variables {x', p'} obtained from {x, p} via clockwise rotation by that angle. For any 
of such origin-centered squeezed states, time evolution is reduced to an increase of the angle with rate 
$\omega_0$, i.e. to the clockwise rotation of the ellipse, without its deformation, with angular velocity $\omega_0$ - see 
the magenta arrows in Fig. 6. As a result, uncertainties δx and δp oscillate in time with double frequency 
$2\omega_0$, while their product is constant at its minimal possible value ħ/2. 

Such squeezed ground states have important implications for quantum measurements (see Sec. 
7.7 below) and may be formed, for example, by parametric excitation of the oscillator, 44 with a 
parameter modulation depth close to, but still below the threshold of parametric oscillations excitation. 
Unfortunately, I do not have time for a further discussion of this interesting topic, 45 but still need to mention 
a more general class of squeezed states, centered to an arbitrary point {X, P} rather than the origin, that 
may be formed by an additional action of the displacement operator (144) on the squeezed ground state 
(172). Calculations similar to those that led us from Eq. (145) to Eq. (155), but now for the product 
operator $\hat{\mathcal{T}}_\alpha\hat{\mathcal{S}}_\zeta$ rather than the bare $\hat{\mathcal{T}}_\alpha$, show that such a general squeezed state is an eigenstate of the 
following mixed operator 

$$\hat{b} \equiv \hat{a}\cosh r + \hat{a}^\dagger e^{i\theta}\sinh r, \tag{5.174a}$$

with eigenvalue 

$$\beta = \alpha\cosh r + \alpha^*e^{i\theta}\sinh r. \tag{5.174b}$$

For the particular case α = 0, Eq. (174b) yields β = 0, i.e. the action of operator (174a) on the squeezed 
ground state $|\zeta\rangle$ with the same r and θ yields the null-state, thus generalizing Eq. (118), which is valid for 
the “usual” (non-squeezed) ground state. 

44 For a discussion and classical theory of this effect, see, e.g., CM Sec. 4.5. 

45 See, e.g., Chapter 7 in C. Gerry and P. Knight, Introductory Quantum Optics, Cambridge U. Press, 2005, and 
the spectacular measurements of the Glauber and squeezed states of electromagnetic (light) oscillators by G. 
Breitenbach et al., Nature 387, 471 (1997), very large (ten-fold) squeezing in such oscillators by H. Vahlbruch et 
al., Phys. Rev. Lett. 100, 033602 (2008); and recent first measurements of the (so far, slight) squeezing in 
mechanical resonators, with eigenfrequency $\omega_0/2\pi$ as low as 3.6 MHz, by E. Wollman et al., Science 349, 952 
(2015). 


5.6. Revisiting spherically-symmetric problems 

One more blank spot to fill has been left in our study of wave mechanics of spherically-symmetric 
3D systems in Sec. 3.6. Indeed, while the eigenfunctions describing axially-symmetric 2D 
systems, and the azimuthal components of those in spherically-symmetric 3D systems, are very simple, 

$$\psi_m = \frac{1}{(2\pi)^{1/2}}\,e^{im\varphi}, \qquad m = 0, \pm1, \pm2, \ldots, \tag{5.175}$$

the polar components of the eigenfunctions in the latter case (i.e., of spherical harmonics) include the 
associated Legendre functions $P_l^m(\cos\theta)$ that may be expressed via elementary functions only indirectly 
- see Eqs. (3.165) and (3.168). This makes all the calculations less than transparent and, in particular, 
does not allow a clear insight into the origin of the very simple eigenvalue spectrum - see, e.g., Eq. 
(3.163). The bra-ket formalism, applied to the angular momentum operator, allows one to get such 
insight, and also produces a very convenient tool for many calculations involving spherically-symmetric 
potentials. 

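Let us start from using the correspondence principle to spell out the quantum-mechanical 
operator of the orbital angular momentum L ≡ r×p of a point particle: 

$$\hat{\mathbf{L}} \equiv \hat{\mathbf{r}}\times\hat{\mathbf{p}} = \begin{vmatrix} \mathbf{n}_x & \mathbf{n}_y & \mathbf{n}_z \\ \hat{x} & \hat{y} & \hat{z} \\ \hat{p}_x & \hat{p}_y & \hat{p}_z \end{vmatrix}, \quad\text{i.e.}\quad \hat{L}_x = \hat{y}\hat{p}_z - \hat{z}\hat{p}_y, \text{ etc.} \tag{5.176}$$

[Angular momentum operator]

From this definition, we can readily calculate the commutation relations for all Cartesian components of 
operators $\hat{\mathbf{L}}$, $\hat{\mathbf{r}}$, and $\hat{\mathbf{p}}$; for example, 

$$[\hat{L}_x, \hat{y}] = [\hat{y}\hat{p}_z - \hat{z}\hat{p}_y,\ \hat{y}] = -\hat{z}[\hat{p}_y, \hat{y}] = i\hbar\hat{z}, \tag{5.177}$$

etc. Using the sequential numbering of coordinate axes ($x \equiv r_1$, etc.), the summary of these calculations 
may be presented in similar, compact (and beautiful!) forms: 

$$[\hat{L}_j, \hat{r}_{j'}] = i\hbar\,\hat{r}_{j''}\varepsilon_{jj'j''}, \qquad [\hat{L}_j, \hat{p}_{j'}] = i\hbar\,\hat{p}_{j''}\varepsilon_{jj'j''}, \qquad [\hat{L}_j, \hat{L}_{j'}] = i\hbar\,\hat{L}_{j''}\varepsilon_{jj'j''}, \tag{5.178}$$

where each of the indices j and j' may take any of values 1, 2, and 3, j'' is the complementary index 
of the same set (not equal to either j or j'), and $\varepsilon_{jj'j''}$ is the Levi-Civita symbol (or “permutation 
symbol”). 46 Also introducing in the natural way a (scalar!) operator of the observable $L^2 \equiv |\mathbf{L}|^2$, 

$$\hat{L}^2 \equiv \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2, \tag{5.179}$$

[Operator of L²]

it is straightforward to check that this operator commutes with each of the Cartesian components: 

$$[\hat{L}^2, \hat{L}_j] = 0. \tag{5.180}$$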


This result, at first sight, may seem to contradict the last of Eqs. (178). Indeed, haven't we 
learned in Sec. 4.5 that commuting operators (e.g., $\hat{L}^2$ and any of $\hat{L}_j$) share their eigenstate sets? If yes, 
shouldn't that mean that this set has to be common for all 4 operators? 47 The resolution of this paradox 
may be found in the condition that was mentioned just after Eq. (4.138), but (sorry!) not sufficiently 
emphasized there. According to that relation, if an operator has degenerate eigenstates (i.e. if $A_j = A_{j'}$ 
even for j ≠ j'), they should not be necessarily shared by another compatible operator. This is exactly the 
situation with the orbital angular momentum operators, that may be schematically shown on a Venn 
diagram (Fig. 8): 48 the set of eigenstates of operator $\hat{L}^2$ is highly degenerate, 49 and is broader than those 
of the component operators $\hat{L}_j$ (that, as will be shown below, are non-degenerate until we consider 
the particle's spin). 



Fig. 5.8. Venn diagram showing (schematically) the partitioning of the set of eigenstates of operator $\hat{L}^2$. Each 
inner sector corresponds to the states shared with one of the Cartesian component operators $\hat{L}_j$, while the outer 
(shaded) ring presents the eigenstates of $\hat{L}^2$ that are not shared with any of $\hat{L}_j$ - e.g., all linear combinations of 
eigenstates of different component operators. 


46 See, e.g., MA Eq. (13.2). 

47 The importance of this issue stems from the following fact: it is easy (and is hence left to the reader :-) to use 
Eqs. (178) to prove that operators of all $\hat{L}_j$ and of $\hat{L}^2$ commute with the Hamiltonian of a particle in the 
spherically-symmetric potential U(r), and hence all their eigenstates are the stationary states in such a field. 

48 This is just a particular example of Venn diagrams (introduced in the 1880s by J. Venn) that show possible 
relations (such as intersections, unions, complements, etc.) between various sets of objects, and are a very useful 
tool in the general set theory. 

49 Note that this particular result is consistent with the classical picture of the angular momentum vector: even 
when its length is fixed, the vector may be oriented in various directions, corresponding to different values of its 
Cartesian components. However, in the classical picture, all these components may be fixed simultaneously, while 
in the quantum picture this is not true. 

Let us focus on just one of these 3 joint sets of eigenstates - by tradition, of operators $\hat{L}^2$ and $\hat{L}_z$. 
(This tradition is due to the canonical form of spherical coordinates, in which the polar angle is 
measured from axis z. Indeed, using Eqs. (63), in the coordinate representation we get the following 
expression, 

$$\hat{L}_z = \hat{x}\hat{p}_y - \hat{y}\hat{p}_x = x\left(-i\hbar\frac{\partial}{\partial y}\right) - y\left(-i\hbar\frac{\partial}{\partial x}\right) = -i\hbar\frac{\partial}{\partial\varphi}. \tag{5.181}$$

Writing the standard eigenproblem for the operator in this representation, $\hat{L}_z\psi_m = L_z\psi_m$, we see that it 
is satisfied by eigenfunctions (175), with eigenvalues $L_z = \hbar m$ - as was already conjectured in Sec. 3.5.) 
More specifically, let us consider a set of eigenstates {l, m} corresponding to a certain degenerate 
eigenvalue of operator $\hat{L}^2$, but all possible eigenvalues of operator $\hat{L}_z$, i.e. all possible quantum numbers 
m. (At this point, l is just some parameter that determines the eigenvalue of $\hat{L}^2$; it will be defined more 
explicitly in a minute.) In order to analyze this set, it is instrumental to introduce the so-called ladder 
(also called, respectively, “raising” and “lowering”) operators 

$$\hat{L}_\pm \equiv \hat{L}_x \pm i\hat{L}_y \tag{5.182}$$


- note a substantial similarity between this definition and Eqs. (98). It is straightforward to use this 
definition and the last of Eqs. (178) to calculate the following commutators: 

$$[\hat{L}_+, \hat{L}_-] = 2\hbar\hat{L}_z, \quad\text{and}\quad [\hat{L}_z, \hat{L}_\pm] = \pm\hbar\hat{L}_\pm, \tag{5.183}$$

and use Eq. (179) to prove another important relation: 

$$\hat{L}^2 = \hbar\hat{L}_z + \hat{L}_z^2 + \hat{L}_-\hat{L}_+. \tag{5.184}$$

Now let us rewrite the last of Eqs. (183) as 

$$\hat{L}_z\hat{L}_\pm = \hat{L}_\pm\hat{L}_z \pm \hbar\hat{L}_\pm, \tag{5.185}$$

and act by its both parts on the ket-vector $|l, m\rangle$ of the set specified above: 

$$\hat{L}_z\hat{L}_\pm|l, m\rangle = \hat{L}_\pm\hat{L}_z|l, m\rangle \pm \hbar\hat{L}_\pm|l, m\rangle. \tag{5.186}$$


Since the eigenvalues of operator $\hat{L}_z$ are equal to ħm, in the first term of the right-hand part we may write 

$$\hat{L}_z|l, m\rangle = \hbar m|l, m\rangle. \tag{5.187}$$

With that, Eq. (186) may be recast as 

$$\hat{L}_z\left(\hat{L}_\pm|l, m\rangle\right) = \hbar(m \pm 1)\left(\hat{L}_\pm|l, m\rangle\right). \tag{5.188}$$

In a spectacular similarity with Eqs. (111)-(112) for the harmonic oscillator, Eq. (188) means 
that states $\hat{L}_\pm|l, m\rangle$ are also the eigenstates of operator $\hat{L}_z$, corresponding to eigenvalues ħ(m ± 1). Thus 
the ladder operators act exactly as the creation and annihilation operators in the oscillator, moving the 
system up or down a ladder of eigenstates (Fig. 9). The most significant difference is that now the state 
ladder must end in both directions, because an infinite increase of |m|, with whatever sign, would cause 
the expectation values of operator 

$$\hat{L}_x^2 + \hat{L}_y^2 = \hat{L}^2 - \hat{L}_z^2, \tag{5.189}$$

which corresponds to a non-negative observable, to become negative. Hence there should be two states 
on both ends of the ladder, $|l, m_{\max}\rangle$ and $|l, m_{\min}\rangle$, for which 

$$\hat{L}_+|l, m_{\max}\rangle = 0, \qquad \hat{L}_-|l, m_{\min}\rangle = 0. \tag{5.190}$$


Due to the symmetry of the whole problem with respect to the replacement m → -m, we should have 
$m_{\min} = -m_{\max}$. This $m_{\max}$ is exactly the quantum number that is traditionally called l, so that 

$$-l \le m \le +l. \tag{5.191}$$

[Relation between m and l]

Fig. 5.9. Hierarchy (ladder diagram) of the common eigenstates of operators $\hat{L}^2$ and $\hat{L}_z$. 


Evidently, this relation of quantum numbers m and l is compatible with the almost-classical 
image of various orientations of the angular momentum vector of the same length in various directions, 
with its z-component taking several (2l + 1) possible values ħm. In this simple picture, however, $L^2$ 
would be equal to the square of $(L_z)_{\max}$, i.e. to $(\hbar l)^2$; however, this is not so. Indeed, applying the operator 
equality (184) to the top state $|l, m_{\max}\rangle \equiv |l, l\rangle$, we get 

$$\hat{L}^2|l, l\rangle = \hbar\hat{L}_z|l, l\rangle + \hat{L}_z^2|l, l\rangle + \hat{L}_-\hat{L}_+|l, l\rangle = \hbar^2l\,|l, l\rangle + \hbar^2l^2|l, l\rangle + 0 = \hbar^2l(l+1)|l, l\rangle. \tag{5.192}$$

[Eigenvalues of L²]

Since by our initial assumption, all eigenvectors $|l, m\rangle$ correspond to the same eigenvalue of operator $\hat{L}^2$, 
this result means that all these eigenvalues are equal to $\hbar^2l(l+1)$. Just as in the case of the spin-½ vector 
operators, the deviation of this result from $\hbar^2l^2$ may be interpreted as the result of unavoidable 
uncertainties (“fluctuations”) of the x- and y-components of the angular momentum, that give a finite 
positive contribution to $L^2$ even if the angular momentum vector is aligned in the best possible way with 
the z-axis. 

Now let us compare our results with those of Sec. 3.6. Using the expression of Cartesian 
coordinates via the spherical ones exactly as was done in Eq. (181), we get the following expressions for 
the ladder operators (182) in the coordinate representation: 

$$\hat{L}_\pm = \hbar e^{\pm i\varphi}\left(\pm\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\varphi}\right). \tag{5.193}$$

Now plugging this equation, together with Eq. (181), into Eq. (184), we get 

$$\hat{L}^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]. \tag{5.194}$$

[Coordinate representation of angular momentum operators]

But this is exactly the operator (besides its division by the constant parameter $2mR^2$) that stands in the left-
hand part of Eq. (3.156). Hence that equation, which was explored by the “brute-force” (wave-
mechanical) approach in Sec. 3.6, may be understood as the eigenproblem for operator $\hat{L}^2$ in the 
coordinate representation, with eigenfunctions $Y_l^m(\theta, \varphi)$ corresponding to eigenkets $|l, m\rangle$, and 
eigenvalues $L^2 = 2mR^2E$. As a reminder, the main result of that rather involved analysis was expressed 
by Eq. (3.163), which now may be rewritten as 

$$L_l^2 \equiv 2mR^2E_l = \hbar^2l(l+1), \tag{5.195}$$

in full agreement with what was obtained in this section by much more efficient means based on the 
bra-ket formalism. In particular, it is fascinating to see how easy many operations with 
eigenvectors $|l, m\rangle$ are now, albeit the wavefunctions of these states, spherical harmonics $Y_l^m(\theta, \varphi)$, have rather 
complex spatial behavior - please have one more look at Eq. (3.171) and Fig. 3.19. 50 
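To see how easy such operations are in practice, one may represent the ladder operators by small matrices in the basis of states $|l, m\rangle$ and check relations (183), (184), and (192) numerically. The Python sketch below does this in units of ħ. It relies on the standard matrix elements $\langle l, m+1|\hat{L}_+|l, m\rangle = \hbar[l(l+1) - m(m+1)]^{1/2}$, which are not derived in the text above and are taken here as an assumption; all function names are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matsub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

def ladder_matrices(two_l):
    """Matrices of L_z, L_+ and L_- (in units of hbar) for l = two_l / 2,
    in the basis ordered as m = l, l-1, ..., -l."""
    l = two_l / 2
    ms = [l - k for k in range(two_l + 1)]
    n = len(ms)
    lz = [[ms[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    lplus = [[0.0] * n for _ in range(n)]
    for col in range(1, n):               # L_+ moves |l, m> (column) to |l, m+1> (row)
        m = ms[col]
        lplus[col - 1][col] = math.sqrt(l * (l + 1) - m * (m + 1))
    lminus = [[lplus[j][i] for j in range(n)] for i in range(n)]   # transpose of a real matrix
    return lz, lplus, lminus

def check(two_l):
    """Return the deviations from [L+, L-] = 2 L_z and from L^2 = l(l+1) I."""
    l = two_l / 2
    lz, lplus, lminus = ladder_matrices(two_l)
    n = len(lz)
    commutator = matsub(matmul(lplus, lminus), matmul(lminus, lplus))
    err_comm = max_abs(matsub(commutator, [[2 * x for x in row] for row in lz]))
    l_squared = matadd(matadd(lz, matmul(lz, lz)), matmul(lminus, lplus))   # Eq. (184)
    target = [[l * (l + 1) if i == j else 0.0 for j in range(n)] for i in range(n)]
    err_l2 = max_abs(matsub(l_squared, target))
    return err_comm, err_l2

for two_l in range(1, 6):
    print(two_l / 2, check(two_l))
```

The argument `two_l` is used instead of l so that the same code also covers half-integer values, which become relevant for spin in the next section.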


5.7. Spin and its addition to orbital angular momentum 


Surprisingly, the theory described in the last section is useful for much more than orbital motion 
analysis. In particular, it helps to generalize the spin-½ results discussed in Chapter 4 to other values of 
spin s - the parameter still to be defined. For that, let us notice that the commutation relations that 
were derived, for s = ½, from the Pauli matrix properties, may be rewritten in exactly the same form as 
Eqs. (178) and (180) for the orbital momentum: 

$$[\hat{S}_j, \hat{S}_{j'}] = i\hbar\,\hat{S}_{j''}\varepsilon_{jj'j''}, \qquad [\hat{S}^2, \hat{S}_j] = 0. \tag{5.196}$$

[Spin operators: commutation relations]

It has been postulated (and confirmed by numerous experiments) that these relations hold true 
for any quantum particle. Now note that all the calculations of the last section have been based almost 
exclusively on such relations - the exception will be discussed imminently. Hence, we may repeat them 
for spin operators, and get the relations similar to Eqs. (187) and (192): 

$$\hat{S}_z|s, m_s\rangle = \hbar m_s|s, m_s\rangle, \qquad \hat{S}^2|s, m_s\rangle = \hbar^2s(s+1)|s, m_s\rangle, \qquad 0 \le s, \quad -s \le m_s \le +s, \tag{5.197}$$

[Spin operators: eigenstates and eigenvalues]

where $m_s$ is a quantum number similar to the orbital number m, and the non-negative constant s is 
defined as the maximum value of $|m_s|$. This parameter is exactly what is called the particle's spin - in the 
narrow sense of the word. 

50 The reader is challenged to use the commutation relations discussed above to prove one more important 
property of the common eigenstates of operators $\hat{L}_z$ and $\hat{L}^2$: 

$\langle l, m|\hat{r}_j|l', m'\rangle = 0$, if either l' ≠ l ± 1, or m ≠ m', or both. 

This property is the basis of the selection rules for dipole quantum transitions, to be discussed later in the course, 
especially in Sec. 9.3. 


Now let us return to the only part of our orbital momentum calculations that has not been derived 
from the commutation relations. This was the fact, based on solution (175) of the orbital motion 
problems, that quantum numbers m (the analog of $m_s$) are integer. For spin, we do not have such a 
solution, so that the spectrum of numbers $m_s$ (and hence its limits ±s) should be found from the 
looser requirement that the eigenstate ladder, extending from -s to +s, has an integer number of steps. 
Hence, 2s has to be integer, i.e. spin s of a quantum particle may be either integer (as it is, for example, 
for photons and gluons), or half-integer (e.g., for all quarks and leptons including electrons). 51 
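The requirement that the ladder from -s to +s has an integer number of steps may be spelled out in a few lines of code. This is a minimal sketch with a hypothetical function name, using exact fractions to avoid rounding issues with half-integer values.

```python
from fractions import Fraction

def magnetic_numbers(s):
    """List the allowed values of m_s for spin s: -s, -s+1, ..., +s.
    Raises ValueError if the ladder cannot have an integer number of steps."""
    s = Fraction(s)
    if s < 0 or (2 * s).denominator != 1:
        raise ValueError("2s must be a non-negative integer")
    return [-s + k for k in range(int(2 * s) + 1)]

print(magnetic_numbers(Fraction(1, 2)))   # spin-1/2: two states
print(magnetic_numbers(1))                # spin-1: three states
print(magnetic_numbers(Fraction(3, 2)))   # spin-3/2: four states
```

For any allowed s the list has 2s + 1 entries, while a value such as s = 1/3 is rejected because its ladder would need a non-integer number of steps.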

For s = ½, this picture yields all spin properties of the electron that were derived in Chapter 4 from 
postulate (4.117). In particular, operators $\hat{S}^2$ and $\hat{S}_z$ have only 2 common eigenstates, with $S_z = \hbar m_s = \pm\hbar/2$, 
and both with $S^2 = s(s+1)\hbar^2 = (3/4)\hbar^2$. Note that this analogy with the angular momentum sheds 
new light on the symmetry properties of electrons. Indeed, the fact that m in Eq. (175) is integer was 
derived in Sec. 3.5 from the requirement that making a full circle around axis z, we should find a similar 
value of wavefunction $\psi_m$, which differs from the initial one by an inconsequential factor $\exp\{2\pi im\}$. 
With the replacement $m \to m_s = \pm½$, such operation would multiply the wavefunction by $\exp\{\pm i\pi\}$, i.e. 
reverse its sign. Of course, spin cannot be described by a usual wavefunction, but this odd parity of 
electrons (and all other spin-½ particles) is clearly revealed in multiparticle systems - see Chapter 8. 
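These spin-½ statements may be checked directly with the Pauli matrices. The sketch below, in units with ħ = 1 and with hypothetical helper names, verifies one of the commutation relations (196), the value $S^2 = (3/4)\hbar^2$, and the sign reversal factor $\exp\{2\pi im_s\}$ for $m_s = ½$.

```python
import cmath
import math

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

HBAR = 1.0
sigma = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
# Spin-1/2 operators: S_j = (hbar / 2) * sigma_j
spin = {axis: scale(HBAR / 2, matrix) for axis, matrix in sigma.items()}

# [S_x, S_y] should equal i * hbar * S_z
commutator_xy = sub(mul(spin["x"], spin["y"]), mul(spin["y"], spin["x"]))
expected_xy = scale(1j * HBAR, spin["z"])

# S^2 = S_x^2 + S_y^2 + S_z^2 should equal (3/4) * hbar^2 times the identity
s_squared = add(add(mul(spin["x"], spin["x"]), mul(spin["y"], spin["y"])),
                mul(spin["z"], spin["z"]))

# Factor acquired at a full turn for m_s = 1/2
full_turn_factor = cmath.exp(2j * math.pi * 0.5)

print(commutator_xy, s_squared, full_turn_factor)
```

Up to rounding, the last printed number is -1, which is the sign reversal discussed above.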

Now we are sufficiently equipped to analyze particles that have both the orbital momentum and 
the spin. In classical mechanics, such a particle would be characterized by the total angular momentum 
vector J = L + S. Following the correspondence principle, we may make an assumption that quantum-
mechanical properties of this observable may be analyzed using the similarly defined vector operator: 

$$\hat{\mathbf{J}} \equiv \hat{\mathbf{L}} + \hat{\mathbf{S}}, \tag{5.198}$$

[Total angular momentum operator]

with Cartesian components 

$$\hat{J}_z \equiv \hat{L}_z + \hat{S}_z, \tag{5.199}$$

etc., and the magnitude squared equal to 

$$\hat{J}^2 \equiv \hat{J}_x^2 + \hat{J}_y^2 + \hat{J}_z^2. \tag{5.200}$$

Let us examine the properties of this vector operator. Since its two components describe 
different degrees of freedom of the particle (again, you may say “belong to different Hilbert spaces”), 
they may be considered as completely commuting: 

$$[\hat{L}_j, \hat{S}_{j'}] = 0. \tag{5.201}$$


51 As a reminder, in the Standard Model of particle physics, such hadrons as mesons and baryons (notably 
including protons and neutrons) are essentially composite particles, with the spin equal to the sum of their 
component quark spins. However, at nonrelativistic energies, protons and neutrons may be considered 
fundamental particles with s = ½. 

These equalities are sufficient to derive the commutation rules of the total angular momentum, 
and, not surprisingly, they turn out to be absolutely similar to those of its components: 

$$[\hat{J}_j, \hat{J}_{j'}] = i\hbar\,\hat{J}_{j''}\varepsilon_{jj'j''}, \qquad [\hat{J}^2, \hat{J}_j] = 0. \tag{5.202}$$

[Total momentum: commutation relations]

Now repeating all arguments of the last section, we may derive the following expressions for the 
common eigenstates of operators $\hat{J}^2$ and $\hat{J}_z$: 

$$\hat{J}_z|j, m_j\rangle = \hbar m_j|j, m_j\rangle, \qquad \hat{J}^2|j, m_j\rangle = \hbar^2j(j+1)|j, m_j\rangle, \qquad 0 \le j, \quad -j \le m_j \le +j, \tag{5.203}$$

[Total momentum: eigenstates and eigenvalues]

where j and $m_j$ are new quantum numbers. Repeating the arguments made for $m_s$, we may conclude that j 
and $m_j$ may be either integer or half-integer. 

Before we proceed, one remark on notation: it is very convenient to use the same letter m for 
numbering eigenstates of all momentum components participating in Eq. (199), with corresponding 
indices (j, l, and s), in particular, to replace what we called m with $m_l$. With this replacement, the main 
results of the last section may be summarized in the form similar to Eqs. (197) and (203): 

$$\hat{L}_z|l, m_l\rangle = \hbar m_l|l, m_l\rangle, \qquad \hat{L}^2|l, m_l\rangle = \hbar^2l(l+1)|l, m_l\rangle, \qquad 0 \le l, \quad -l \le m_l \le +l. \tag{5.204}$$

In order to understand which eigenstates used in Eqs. (197), (203), and (204) are compatible with 
each other, let us use Eqs. (198)-(202) to calculate the mutual commutators of the operators squared and 
their z-components. The result is 

$$[\hat{J}^2, \hat{L}^2] = 0, \qquad [\hat{J}^2, \hat{S}^2] = 0, \tag{5.205}$$

$$[\hat{J}^2, \hat{L}_z] \ne 0, \qquad [\hat{J}^2, \hat{S}_z] \ne 0. \tag{5.206}$$

This result may be presented schematically on the following Venn diagram (Fig. 10), in which the 
crossed arrows indicate the only non-commuting pairs of operators. 

Fig. 5.10. Venn diagram for angular momentum operators, and their mutually-commuting groups. 

This means that just as for each component angular momentum (J, L, and S) considered 
separately we could select a group of common eigenstates for its magnitude squared and the z-
component, we also may find eigenstates shared by two broader groups of operators, encircled with 
colored lines in Fig. 10. The first group (within the red circle) consists of all operators but $\hat{J}^2$. This 
means that there are eigenstates shared by the 5 remaining operators, and they may be characterized by 
certain values of the corresponding quantum numbers: l, $m_l$, s, $m_s$, and $m_j$. Actually, only 4 of these 
numbers are independent, because due to Eq. (199) for these compatible operators, for each eigenstate of 
the group, their “magnetic” quantum numbers m have to satisfy the following relation: 

$$m_j = m_l + m_s. \tag{5.207}$$

Hence the common eigenstates of the operators of this group are fully defined by just 4 quantum 
numbers, for example, l, $m_l$, s, and $m_s$. For some calculations, especially those for systems whose 
Hamiltonians include only operators of this group, it is convenient to use this set of eigenstates as 
the basis; frequently this is called the uncoupled representation. 

However, in some situations we cannot ignore interactions between the orbital and spin degrees 
of freedom (in the common jargon, the spin-orbit coupling), which leads in particular to splitting (called 
the fine structure) of atomic energy levels even in the absence of an external magnetic field. I will discuss 
these effects in detail in the next chapter, and now will only note that they may be described by a 
separate term, proportional to the product $\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$, in the system's Hamiltonian. If this term is not negligible, 
the uncoupled representation becomes inconvenient. Indeed, writing 

$$\hat{J}^2 = (\hat{\mathbf{L}} + \hat{\mathbf{S}})^2 = \hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}, \tag{5.208}$$

and looking at Fig. 10 again, we see that the operator $\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$, describing the spin-orbit coupling, does not 
commute with operators $\hat{L}_z$ and $\hat{S}_z$. This means that stationary states of the system with such a term in 
the Hamiltonian do not belong to the uncoupled representation basis. On the other hand, Eq. (208) 
shows that operator $\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ does commute with all 4 operators of another group, encircled with the blue 
line in Fig. 10. According to Eqs. (201), (202), and (205), all operators of that group also commute with 
each other, so that they have common eigenstates that may be marked by the corresponding quantum 
numbers, $l$, $s$, $j$, and $m_j$. This group is the basis for the coupled representation of the particle's state. 
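
The commutation pattern of Eqs. (205)-(206), and the properties of the spin-orbit operator just discussed, may be checked numerically for a particular case. The following Python sketch is only an illustration under stated assumptions: it takes $l = 1$ and $s = ½$, uses units with $\hbar = 1$, relies on the NumPy library, and builds the matrices from the standard ladder-operator matrix elements in the z-basis. The helper names (angular_momentum_matrices, squared, commute) are hypothetical, introduced just for this example.

```python
import numpy as np

def angular_momentum_matrices(q):
    """Matrices (Ax, Ay, Az) of an angular momentum with quantum number q,
    in units of hbar, in the z-basis ordered m = q, q-1, ..., -q."""
    dim = int(round(2 * q)) + 1
    m = q - np.arange(dim)
    a_plus = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        # <m+1| A_+ |m> = [q(q+1) - m(m+1)]^(1/2), with m = m[k]
        a_plus[k - 1, k] = np.sqrt(q * (q + 1) - m[k] * (m[k] + 1))
    a_minus = a_plus.conj().T
    a_x = (a_plus + a_minus) / 2
    a_y = (a_plus - a_minus) / 2j      # 2j is the imaginary number 2i
    a_z = np.diag(m).astype(complex)
    return a_x, a_y, a_z

l, s = 1, 0.5                          # assumed example values
dim_l, dim_s = 2 * l + 1, int(2 * s) + 1
# L acts on the orbital factor, S on the spin factor of the product space
L = [np.kron(a, np.eye(dim_s)) for a in angular_momentum_matrices(l)]
S = [np.kron(np.eye(dim_l), a) for a in angular_momentum_matrices(s)]
J = [a + b for a, b in zip(L, S)]

def squared(vec):
    """Sum of the squares of the three Cartesian components."""
    return sum(c @ c for c in vec)

def commute(a, b):
    """True if the matrices a and b commute (within rounding errors)."""
    return bool(np.allclose(a @ b, b @ a))

L2, S2, J2 = squared(L), squared(S), squared(J)
LS = sum(a @ b for a, b in zip(L, S))
Lz, Sz, Jz = L[2], S[2], J[2]

results = {
    "J2 = L2 + S2 + 2 L.S": bool(np.allclose(J2, L2 + S2 + 2 * LS)),
    "[J2, L2] = 0": commute(J2, L2),
    "[J2, S2] = 0": commute(J2, S2),
    "[J2, Lz] = 0": commute(J2, Lz),
    "[J2, Sz] = 0": commute(J2, Sz),
    "[L.S, Lz] = 0": commute(LS, Lz),
    "[L.S, Sz] = 0": commute(LS, Sz),
    "[L.S, J2] = 0": commute(LS, J2),
    "[L.S, Jz] = 0": commute(LS, Jz),
}
for name, value in results.items():
    print(f"{name}: {value}")
```

In this sketch the statements involving $\hat{L}_z$ or $\hat{S}_z$ alone come out false, while all statements for the operators of the coupled group come out true, in agreement with Fig. 10.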


Excluding the quantum numbers $l$ and $s$, common for both groups, from notation, it is convenient 
to denote the common ket-vectors of each group as, respectively, 

Coupled and uncoupled bases 

$$|m_l, m_s\rangle \ \text{for the uncoupled representation's basis,} \qquad |j, m_j\rangle \ \text{for the coupled representation's basis.} \tag{5.209}$$


As we will see in the next chapter, for solution of some important problems (e.g., the fine structure of 
atomic spectra and the Zeeman effect), we will need the relation between the kets $|j, m_j\rangle$ and the kets 
$|m_l, m_s\rangle$. This relation may be represented as the usual linear superposition, 

Definition of Clebsch-Gordan coefficients 

$$|j, m_j\rangle = \sum_{m_l, m_s} |m_l, m_s\rangle \langle m_l, m_s | j, m_j\rangle, \tag{5.210}$$

whose bra-kets (c-numbers), essentially the elements of the unitary matrix of the transformation between 
two eigenstate bases (209), are called the Clebsch-Gordan coefficients. 

The best (though imperfect) classical interpretation of Eq. (210) I can offer is as follows. If the 
lengths of vectors L and S (in quantum mechanics associated with numbers $l$ and $s$, respectively), and 
also their scalar product L·S, are all fixed, then so is the length of vector J = L + S (whose length in 
quantum mechanics is described by quantum number $j$). Hence, the classical image of a specific 
eigenket $|j, m_j\rangle$, in which $l$, $s$, $j$, and $m_j$ are all fixed, is a state in which $L^2$, $S^2$, $J^2$, and $J_z$ are fixed. 
However, this fixation still allows for an arbitrary rotation of the pair of vectors L and S (with a fixed 
angle between them, and hence fixed L·S and $J^2$) about the direction of vector J - see Fig. 11. 

Fig. 5.11. Classical image of two quantum states with the same $l$, $s$, $j$, and $m_j$, but different $m_l$ and $m_s$. 

Hence the components $L_z$ and $S_z$ in these conditions are not fixed, and in classical mechanics 
may take a continuum of values, two of which (with the largest and smallest possible values of $S_z$) are 
shown in Fig. 11. In quantum mechanics, these components are quantized, with their states represented 
by eigenkets $|m_l, m_s\rangle$, so that a linear combination of such kets is necessary to represent ket $|j, m_j\rangle$. This 
is exactly what Eq. (210) does. 

Some properties of the Clebsch-Gordan coefficients $\langle m_l, m_s | j, m_j\rangle$ may be readily established. 
For example, the coefficients do not vanish only if the involved magnetic quantum numbers satisfy Eq. 
(207); let us prove this fact. 52 All matrix elements of the null-operator 

$$\hat{J}_z - (\hat{L}_z + \hat{S}_z) = \hat{0} \tag{5.211}$$

should equal zero in any basis; in particular 

$$\langle j, m_j |\, \hat{J}_z - (\hat{L}_z + \hat{S}_z) \,| m_l, m_s\rangle = 0. \tag{5.212}$$

Acting by operator $\hat{J}_z$ on the bra-vector, and by the sum $(\hat{L}_z + \hat{S}_z)$ on the ket-vector, we get 

$$[m_j - (m_l + m_s)]\,\langle j, m_j | m_l, m_s\rangle = 0, \tag{5.213}$$

thus proving that $\langle m_l, m_s | j, m_j\rangle = \langle j, m_j | m_l, m_s\rangle^* = 0$, if $m_j - (m_l + m_s) \ne 0$. 


For the most important case of spin-½ particles ($s = \tfrac12$, and hence $m_s = \pm\tfrac12$), whose uncoupled 
representation basis includes $2\times(2l + 1)$ states, restriction (207) enables the representation of all 
nonvanishing Clebsch-Gordan coefficients on the simple diagram shown in Fig. 12. Indeed, each 
coupled-representation eigenket $|j, m_j\rangle$, with $m_j = m_l + m_s = m_l \pm \tfrac12$, may be related by non-zero 
Clebsch-Gordan coefficients to at most two uncoupled-representation eigenstates $|m_l, m_s\rangle$. Since $m_l$ may 
only take integer values from $-l$ to $+l$, $m_j$ may only take half-integer values on the interval $[-l - \tfrac12, l + \tfrac12]$. 
Hence, by the definition of $j$ as $(m_j)_{\max}$, its maximum value has to be $l + \tfrac12$, and for $m_j = l + \tfrac12$, this 
is the only possible value. This means that the uncoupled state with $m_l = l$ and $m_s = \tfrac12$ should be 
identical to the coupled-representation state with $j = l + \tfrac12$ and $m_j = l + \tfrac12$: 

$$|j = l + \tfrac12,\ m_j = l + \tfrac12\rangle = |m_l = m_j - \tfrac12,\ m_s = +\tfrac12\rangle. \tag{5.214}$$


52 One may think that Eq. (207) is a trivial corollary of Eq. (199). However, now we should be a bit more careful, 
because in the Clebsch-Gordan coefficients, these quantum numbers characterize different groups of eigenstates. 

Fig. 5.12. All possible sets of eigenvalues $m_l$, $m_s$ for a particle with fixed $l$, and $s = ½$. Each 
uncoupled-representation state is represented by a dot, while each coupled-representation state, 
by a single sloped line connecting the dots. 


However, already for the next value, $m_j = l - \tfrac12$, we need to have two values of $j$, so that two $|m_l, m_s\rangle$ 
kets are to be related to two $|j, m_j\rangle$ kets, each by two Clebsch-Gordan coefficients. Since $j$ changes in unit 
steps, these values of $j$ have to be $l \pm \tfrac12$. This choice, 

$$j = l \pm \tfrac12, \tag{5.215}$$

evidently satisfies all lower values of $m_j$ (again, with only one value, $j = l + \tfrac12$, necessary for the lowest 
$m_j = -l - \tfrac12$) - see Fig. 12. Note that the total number of the coupled-representation states is $1 + 2\times 2l + 1 
= 2(2l + 1)$, i.e. the same as in the uncoupled representation. So, each sum (210), for fixed $j$, $m_j$ (and 
fixed common parameter $l$), has at most 2 terms, i.e. involves at most 2 Clebsch-Gordan coefficients. 
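
The state counting summarized in Fig. 12 may be reproduced by a short enumeration. The following Python sketch is only an illustration for $s = ½$: it lists the uncoupled pairs $(m_l, m_s)$ and the coupled pairs $(j, m_j)$ with $j = l \pm ½$, and checks that both bases contain $2(2l + 1)$ states, with the same number of states for each value of $m_j = m_l + m_s$. The function names are hypothetical, and exact fractions are used only to avoid rounding issues with half-integer numbers.

```python
from collections import Counter
from fractions import Fraction

HALF = Fraction(1, 2)

def uncoupled_states(l):
    """All (m_l, m_s) pairs for an integer orbital number l and spin 1/2."""
    return [(Fraction(m_l), m_s)
            for m_l in range(-l, l + 1)
            for m_s in (-HALF, HALF)]

def coupled_states(l):
    """All (j, m_j) pairs with j = l - 1/2 and j = l + 1/2, Eq. (5.215).
    For l = 0 only j = 1/2 exists, so the negative value is skipped."""
    states = []
    for j in (l - HALF, l + HALF):
        if j < 0:
            continue
        m_j = -j
        while m_j <= j:
            states.append((j, m_j))
            m_j += 1
    return states

for l in range(4):
    unc, cpl = uncoupled_states(l), coupled_states(l)
    # Number of states per value of m_j in each representation
    per_mj_uncoupled = Counter(m_l + m_s for m_l, m_s in unc)
    per_mj_coupled = Counter(m_j for _, m_j in cpl)
    assert len(unc) == len(cpl) == 2 * (2 * l + 1)
    assert per_mj_uncoupled == per_mj_coupled
    print(f"l = {l}: {len(cpl)} states in each basis")
```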


These coefficients may be calculated in a few steps, all but the last one rather simple even for 
arbitrary spin $s$. First, the matrix elements of ladder operators $\hat{L}_\pm$ in the standard z-basis (i.e. in the basis 
of kets $|m_l\rangle$) may be calculated from Eq. (184). Next, the similarity of vector operators $\hat{\mathbf{J}}$ and $\hat{\mathbf{S}}$ to 
operator $\hat{\mathbf{L}}$, expressed by Eqs. (197), (203), and (204), may be used to argue that 
operators $\hat{S}_\pm$ and $\hat{J}_\pm$, defined absolutely similarly to $\hat{L}_\pm$, have similar matrix elements in the bases 
of kets $|m_s\rangle$ and $|m_j\rangle$, respectively. After that, acting by operator $\hat{J}_\pm = \hat{L}_\pm + \hat{S}_\pm$ upon both parts of Eq. 
(210), and then inner-multiplying the result by the bra-vector $\langle m_l, m_s|$ and using the above matrix 
elements, we get recurrence relations for the Clebsch-Gordan coefficients. Finally, these relations may 
be recurrently applied to the adjacent states in both representations, starting from any of the two states 
common for them - for example, from the state with ket-vectors (214), corresponding to the top right point 
in Fig. 12. Let me leave these straightforward but a bit tedious calculations for the reader's exercise and just 
cite the final result of this procedure for $s = ½$: 53 


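Clebsch-Gordan coefficients for s = ½ 

$$\langle m_l = m_j - \tfrac12,\ m_s = +\tfrac12 \,|\, j = l \pm \tfrac12,\ m_j\rangle = \pm\left(\frac{l \pm m_j + 1/2}{2l + 1}\right)^{1/2},$$

$$\langle m_l = m_j + \tfrac12,\ m_s = -\tfrac12 \,|\, j = l \pm \tfrac12,\ m_j\rangle = +\left(\frac{l \mp m_j + 1/2}{2l + 1}\right)^{1/2}. \tag{5.216a}$$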

53 For arbitrary spin $s$, the calculations and even the final expressions for the Clebsch-Gordan coefficients are 
rather bulky. They may be found, typically in a table form, mostly in special monographs - see, e.g., A. R. 
Edmonds, Angular Momentum in Quantum Mechanics, Princeton U. Press, 1957. 

For applications, it may be more convenient to use this result in the following equivalent form: 

$$|j = l \pm \tfrac12,\ m_j\rangle = \pm\left(\frac{l \pm m_j + 1/2}{2l + 1}\right)^{1/2} |m_l = m_j - \tfrac12,\ m_s = +\tfrac12\rangle + \left(\frac{l \mp m_j + 1/2}{2l + 1}\right)^{1/2} |m_l = m_j + \tfrac12,\ m_s = -\tfrac12\rangle. \tag{5.216b}$$


We will use this relation in Sec. 6.4 for an analysis of the anomalous Zeeman effect, based on the 
perturbation theory. Moreover, most of the angular momentum addition theory described above is 
immediately applicable to the addition of angular momenta in multiparticle systems, so we will revisit it 
in Chapter 8. 
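
Some properties of the coefficients (216) are easy to check numerically. The Python sketch below is a minimal illustration, not a general Clebsch-Gordan routine: it evaluates Eqs. (216) for $s = ½$ and verifies that, for each $m_j$, the two coupled kets with $j = l \pm ½$ are normalized and mutually orthogonal, and that the top state reduces to Eq. (214). The function name cg_spin_half and its sign argument are hypothetical, introduced for this example only.

```python
from math import isclose, sqrt

def cg_spin_half(l, j_sign, m_j):
    """Coefficients of Eq. (5.216) for j = l + j_sign/2, with j_sign = +1 or -1.

    Returns (c_up, c_down): the amplitudes of the uncoupled kets
    |m_l = m_j - 1/2, m_s = +1/2> and |m_l = m_j + 1/2, m_s = -1/2>
    in the coupled ket |j, m_j>.
    """
    c_up = j_sign * sqrt((l + j_sign * m_j + 0.5) / (2 * l + 1))
    c_down = sqrt((l - j_sign * m_j + 0.5) / (2 * l + 1))
    return c_up, c_down

l = 2                                    # assumed example value
for two_mj in range(-(2 * l - 1), 2 * l, 2):
    m_j = two_mj / 2                     # values shared by j = l + 1/2 and j = l - 1/2
    up_p, down_p = cg_spin_half(l, +1, m_j)
    up_m, down_m = cg_spin_half(l, -1, m_j)
    # Each coupled ket is normalized ...
    assert isclose(up_p**2 + down_p**2, 1.0)
    assert isclose(up_m**2 + down_m**2, 1.0)
    # ... and the two kets with the same m_j are orthogonal
    assert isclose(up_p * up_m + down_p * down_m, 0.0, abs_tol=1e-12)

# The top state m_j = l + 1/2 contains a single uncoupled ket, Eq. (5.214)
top = cg_spin_half(l, +1, l + 0.5)
print("top state amplitudes:", top)
```

These checks express the unitarity of the transformation between the two bases, mentioned after Eq. (210).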

To conclude this section, I have to note that the Clebsch-Gordan coefficients (for arbitrary s ) 
participate also in the so-called Wigner-Eckart theorem that expresses matrix elements of certain 
spherical tensors, in the coupled-representation basis $|j, m_j\rangle$, via a reduced set of matrix elements. 
Unfortunately, a discussion of this theorem and its applications would require a higher mathematical 
background than I can expect from my readers, and more time/space than I can afford. 54 


5.8. Exercise problems 

5.1. Use the discussion of Sec. 1 to find an alternative solution of Problem 4.17. 

5.2. A two-level system is in a quantum state $\alpha$, described by ket-vector $|\alpha\rangle = \alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle$, with 
given (generally, complex) c-number coefficients $\alpha_{\uparrow\downarrow}$. Prove that we can always select a 3-component 
vector $\mathbf{a} = \{a_x, a_y, a_z\}$ of real c-numbers, such that $\alpha$ is an eigenstate of operator $\mathbf{a}\cdot\hat{\boldsymbol\sigma}$, where $\hat{\boldsymbol\sigma}$ is the 
operator described, in the z-basis, by the Pauli matrix vector. Find all possible values of $\mathbf{a}$ satisfying this 
condition, and the second eigenstate of operator $\mathbf{a}\cdot\hat{\boldsymbol\sigma}$, orthogonal to the given $\alpha$. Give a Bloch-sphere 
interpretation of your result. 


5.3. A spin-½ particle is in a constant vertical field, so that its Hamiltonian is 

$$\hat{H} = \frac{\hbar\Omega}{2}\hat{\sigma}_z,$$

but its spin's initial state is an eigenstate of a different Hamiltonian: 55 

$$\hat{H}_{\rm ini} = \mathbf{a}\cdot\hat{\boldsymbol\sigma} \equiv a_x\hat{\sigma}_x + a_y\hat{\sigma}_y + a_z\hat{\sigma}_z.$$

Use any approach you like to calculate the time evolution of the expectation values of the spin 
components. Interpret the results. 


5.4. For any periodic motion of a single particle in a confining potential $U(\mathbf{r})$, the virial theorem 
of nonrelativistic classical mechanics 56 is reduced to the following equality: 

$$\overline{T} = \frac{1}{2}\,\overline{\mathbf{r}\cdot\nabla U},$$

where $T$ is the particle's kinetic energy, and the top bar means averaging over the period of motion. Prove 
the quantum-mechanical version of the theorem for an arbitrary stationary quantum state, in the absence 
of spin effects: 

$$\langle T\rangle = \frac{1}{2}\langle \mathbf{r}\cdot\nabla U\rangle,$$

where the angular brackets mean, as everywhere in these notes, the expectation value of the variable 
inside them. 

Hint: Mimicking the proof of the classical virial theorem, consider the time evolution of operator 
$\hat{G} \equiv \hat{\mathbf{r}}\cdot\hat{\mathbf{p}}$. 

54 For the interested reader I can recommend either Sec. 17.7 in E. Merzbacher, Quantum Mechanics, 3rd ed., 
Wiley, 1998, or Sec. 3.10 in J. Sakurai, Modern Quantum Mechanics, Addison-Wesley, 1994. 

55 Cf. Problems 4.22, 4.23, 5.2. 

56 See, e.g., CM Problem 1.12. 

5.5. A constant force $F$ is applied to an (otherwise free) 1D particle of mass $m$. Calculate the 
eigenfunctions of the problem, using 

(i) the coordinate representation, and 

(ii) the momentum representation. 

Discuss the relation between the results. 

5.6. The momentum representation of an operator, defined in the Hilbert space of 1D orbital 
states of a particle, equals $p^{-1}$. Find its coordinate representation. 

5.7.* For a particle moving in a 3D periodic potential, develop the bra-ket formalism for the q-representation, 
in which a complex amplitude similar to $a_q$ in Eq. (2.234) (but generalized to 3D and all 
energy bands) plays the role of the wavefunction. In particular, calculate operators $\hat{\mathbf{r}}$ and $\hat{\mathbf{v}}$ in this 
representation, and use the result to prove Eq. (2.237) for 1D motion in the low-field limit. 

Hint : Try to generalize the analysis of the momentum representation in Sec. 5.2. 

5.8. In the Heisenberg picture of quantum dynamics, find the operators of velocity and 
acceleration, 

$$\hat{\mathbf{v}} \equiv \frac{d\hat{\mathbf{r}}}{dt} \quad\text{and}\quad \hat{\mathbf{a}} \equiv \frac{d\hat{\mathbf{v}}}{dt},$$

of an electron moving in an arbitrary electromagnetic field. Compare the results with the corresponding 
classical expressions. 

5.9. Calculate, in the WKB approximation, the transmission coefficient $T$ for tunneling of a 2D 
particle with energy $E < U_0$ through a saddle-shaped potential "pass" 

$$U(x, y) = U_0\left(1 + \frac{xy}{a^2}\right),$$

where $U_0$ and $a$ are real constants. 

5.10. Calculate the so-called Gamow factor 57 for the alpha decay of atomic nuclei, i.e. the 
exponential factor in the transparency of the tunnel barrier, resulting from the following simple model of 
the particle's potential energy as a function of its distance from the nuclear center: 

$$U(r) = \begin{cases} U_0 < 0, & \text{for } r < R, \\[4pt] \dfrac{ZZ'e^2}{4\pi\varepsilon_0 r}, & \text{for } R < r, \end{cases}$$

(where $Ze = 2e > 0$ is the charge of the alpha-particle, $Z'e > 0$ is that of the nucleus after the decay, and 
$R$ is the nucleus' radius), in the WKB approximation. 


5.11. For a 1D harmonic oscillator with mass $m$ and frequency $\omega_0$, calculate: 

(i) all matrix elements $\langle n|\hat{x}^3|n'\rangle$, and 

(ii) diagonal matrix elements $\langle n|\hat{x}^4|n\rangle$, 
where $n$ are the Fock states. 


5.12. Calculate the sum (over all $n > 0$) of the so-called oscillator strengths, 

$$f_n \equiv \frac{2m}{\hbar^2}(E_n - E_0)\,|\langle n|\hat{x}|0\rangle|^2,$$

of quantum transitions between the $n$-th energy level and the ground state, for 

(i) a 1D harmonic oscillator, and 

(ii) a 1D particle confined in an arbitrary stationary potential. 


5.13.* Prove the so-called Bethe sum rule, 

$$\sum_{n'} (E_{n'} - E_n)\,\bigl|\langle n|e^{ik\hat{x}}|n'\rangle\bigr|^2 = \frac{\hbar^2k^2}{2m}$$

(where $k$ is a c-number), valid for a 1D particle moving in an arbitrary time-independent potential $U(x)$, 
and discuss its relation with the Thomas-Reiche-Kuhn sum rule, whose derivation was the subject of the 
previous problem. 

Hint: Calculate the expectation value, in a stationary state $n$, of the following double commutator, 

$$\hat{D} \equiv \bigl[[\hat{H}, e^{ik\hat{x}}],\, e^{-ik\hat{x}}\bigr],$$

in two ways - first, just spelling out both commutators, and, second, using the commutation relations 
between operators $\hat{p}_x$ and $e^{\pm ik\hat{x}}$, and compare the results. 


5.14. Simplify the following operators: 

(i) $\exp\{+ia\hat{x}\}\,\hat{p}_x \exp\{-ia\hat{x}\}$, and 

(ii) $\exp\{+ia\hat{p}_x\}\,\hat{x} \exp\{-ia\hat{p}_x\}$, 
where $a$ is a c-number. 

57 Named after G. Gamow, who made this calculation as early as in 1928. 

5.15. Use the Heisenberg equation of motion for a direct derivation of the time evolution law (5.171) 
of the creation and annihilation operators of a harmonic oscillator. 


5.16. Calculate: 

(i) the expectation value of energy, and 

(ii) the laws of time evolution of expectation values of the coordinate and momentum 
for a 1D harmonic oscillator, provided that in the initial moment ($t = 0$) it was in state 

where $|n\rangle$ are ket-vectors of the stationary (Fock) states of the oscillator. 

5.17.* Re-derive the London dispersion force potential between two 3D harmonic oscillators 
(already calculated in Problem 3.19), using the language of mutually-induced polarization. 

5.18. The discussion of the Glauber state properties in Sec. 5 has used the following general 
statement: if 

$$[\hat{A}, \hat{B}] = \mu\hat{I},$$

where $\hat{A}$ and $\hat{B}$ are arbitrary operators, and $\mu$ is an arbitrary c-number, then 

$$\exp\{+\hat{A}\}\,\hat{B}\exp\{-\hat{A}\} = \hat{B} + \mu\hat{I}.$$

Prove the statement. 

Hint: One (of several) ways to prove the statement is to expand operator 
$\hat{f}(\lambda) \equiv \exp\{+\lambda\hat{A}\}\,\hat{B}\exp\{-\lambda\hat{A}\}$ into the Taylor series in c-number $\lambda$, and then evaluate it at $\lambda = 1$. 

5.19. An external force pulse $F(t)$, of a finite time duration $T$, is exerted on a 1D harmonic 
oscillator, initially in its ground state. Use the Heisenberg-picture equations of motion to calculate the 
expectation value of oscillator’s energy at the end of the pulse. 

5.20. Calculate the energy of the squeezed ground state of a harmonic oscillator, defined by 
Eq. (172). 

5.21. Use Eqs. (5.178) of the lecture notes to prove that operators $\hat{L}_j$ and $\hat{L}^2$ commute with the 
Hamiltonian of a spinless particle placed in any central potential field. 

5.22. Prove the following relations for the operators of the angular momentum: 

$$\hat{L}^2 = \hat{L}_z^2 + \hat{L}_+\hat{L}_- - \hbar\hat{L}_z = \hat{L}_z^2 + \hat{L}_-\hat{L}_+ + \hbar\hat{L}_z.$$

(One of them, Eq. (184), was already used in Sec. 6.) 

5.23. According to Eqs. (188) and their discussion, action of the ladder operators on the common 
eigenkets $|l, m\rangle$ of operators $\hat{L}^2$ and $\hat{L}_z$ may be described as 

$$\hat{L}_\pm|l, m\rangle = L_\pm^{(m)}|l, m \pm 1\rangle.$$

Calculate coefficients $L_\pm^{(m)}$, assuming that the eigenstates are normalized: $\langle l, m|l, m\rangle = 1$. 


5.24. In the basis of common eigenstates of operators $\hat{L}_z$ and $\hat{L}^2$, described by eigenkets $|l, m\rangle$: 

(i) calculate matrix elements $\langle l, m_1|\hat{L}_x|l, m_2\rangle$ and $\langle l, m_1|\hat{L}_x^2|l, m_2\rangle$; 

(ii) spell out your results for diagonal matrix elements (with $m_1 = m_2$) and their y-axis 
counterparts; and 

(iii) calculate diagonal matrix elements $\langle l, m|\hat{L}_x\hat{L}_y|l, m\rangle$ and $\langle l, m|\hat{L}_y\hat{L}_x|l, m\rangle$. 

5.25. For the state described by the common eigenket $|l, m\rangle$ of operators $\hat{L}_z$ and $\hat{L}^2$ in a reference 
frame $\{x, y, z\}$, calculate the expectation values $\langle L_{z'}\rangle$ and $\langle L_{z'}^2\rangle$ in the reference frame whose axis $z'$ 
forms angle $\theta$ with axis $z$. 

5.26. Write down the matrices of the following angular momentum operators: 
$\hat{L}_x$, $\hat{L}_y$, $\hat{L}_z$, and $\hat{L}_\pm$, in the z-basis of states with $l = 1$. 

5.27. Find the angular part of the orbital wavefunction of a particle with a definite value of $L^2$, 
equal to $6\hbar^2$, and the largest possible value of $L_x$. What is this value? 

5.28.* A charged 2D particle is trapped in a soft in-plane potential well $U(x, y) = m\omega_0^2(x^2 + y^2)/2$. 
Calculate its energy spectrum in the presence of an additional uniform magnetic field normal to the 
plane. 


5.29. Calculate the spectrum of rotational energies of an axially-symmetric, rigid molecule. 

5.30. For the state with wavefunction $\psi = Cxye^{-\lambda r}$, with a real, positive $\lambda$, calculate: 

(i) the expectation values of observables $L_x$, $L_y$, $L_z$ and $L^2$, and 

(ii) the normalization constant $C$. 

5.31. An angular state of a spinless particle is described by the following ket-vector: 

$$|\alpha\rangle = \frac{1}{\sqrt{2}}\bigl(|l = 3, m = 0\rangle + |l = 3, m = 1\rangle\bigr).$$

Find the expectation values of the x- and y-components of its angular momentum. Is it sensitive to a 
possible phase shift between two component eigenkets? 

5.32. Simplify the following double commutator: $[\hat{r}_j, [\hat{L}^2, \hat{r}_{j'}]]$. 


5.33. Express the commutators listed in Eq. (206), $[\hat{J}^2, \hat{L}_z]$ and $[\hat{J}^2, \hat{S}_z]$, via $\hat{L}_j$ and $\hat{S}_j$. 

5.34. Find the operator $\hat{T}_\phi$ describing the state rotation by angle $\phi$ about a certain axis, using the 
similarity of this operation with the shift of a Cartesian coordinate, discussed in Sec. 5. Then use it to 
calculate the probabilities of measurements of a beam of particles with z-polarized spin-½, by a Stern-Gerlach 
instrument turned by angle $\theta$ within the [z, x] plane (where y is the axis of particle propagation 
- see Fig. 4.1). 58 

5.35. The rotation ("angle translation") operators $\hat{T}_\phi$, analyzed in the previous problem, and the 
coordinate translation operator $\hat{T}_X$, discussed in Sec. 5.5 of the lecture notes, have a similar structure: 

$$\hat{T}_\lambda = \exp\left\{-i\frac{\hat{C}\lambda}{\hbar}\right\},$$

where $\lambda$ is a real c-number, characterizing the shift's magnitude, and $\hat{C}$ is a Hermitian operator that does 
not explicitly depend on time. 

(i) Prove that all such operators $\hat{T}_\lambda$ are unitary. 

(ii) Prove that if the shift by $\lambda$, induced by operator $\hat{T}_\lambda$, leaves the Hamiltonian of some system 
unchanged for any $\lambda$, then the variable $C$, corresponding to the operator $\hat{C}$, is a constant of motion. 

(iii) Discuss what the last conclusion gives for the particular operators $\hat{T}_X$ and $\hat{T}_\phi$. 

5.36. A particle is in a state $\alpha$ with the orbital wavefunction proportional to the spherical 
harmonic $Y_1^1(\theta, \varphi)$. Find the angular dependence of the wavefunctions corresponding to the following 
ket-vectors: 

(i) $\hat{L}_x|\alpha\rangle$, (ii) $\hat{L}_y|\alpha\rangle$, (iii) $\hat{L}_z|\alpha\rangle$, (iv) $\hat{L}_+\hat{L}_-|\alpha\rangle$, and (v) $\hat{L}^2|\alpha\rangle$. 

5.37. For a state with definite quantum numbers $l$ and $j$, prove that observable L·S also has a 
definite value, and calculate this value. 

5.38.* Derive the general recurrence relations for the Clebsch-Gordan coefficients. 

Hint : Using the similarity of commutation relations, discussed in Sec. 7, generalize the solution 
of Problem 19 to all angular momentum operators, and apply them to Eq. (198). 

5.39. The byproduct of the solution of the previous problem is the general relation for the spin 
operators (valid for any spin $s$), which may be rewritten as 

$$\langle m_s \pm 1|\hat{S}_\pm|m_s\rangle = \hbar\,[(s \pm m_s + 1)(s \mp m_s)]^{1/2},$$

provided that all other quantum numbers are fixed. Use this result to spell out the matrices $S_x$, $S_y$, $S_z$, and 
$S^2$ of a particle with $s = 1$, in the z-basis - defined as the basis in which the matrix $S_z$ is diagonal. 

58 Note that the last task is just a particular case of Problem 4.17 (see also Problem 1). 

5.40. For a particle with spin $s$, moving in a spherically-symmetric field, find the ranges of 
possible values of quantum numbers $m_j$ and $j$, necessary to describe, in the coupled representation basis: 

(i) all states with a definite quantum number $l$, and 

(ii) a state with definite value of not only $l$, but also $m_l$ and $m_s$. 

Give an interpretation of your results in terms of the classical geometric vector diagram (see Fig. 11). 

5.41. A spin-½ particle moves in a centrally-symmetric potential $U(r)$. Using Eqs. (216) for the 
Clebsch-Gordan coefficients, 

(i) write explicit expressions for the ket vectors for states that would be simultaneously the 
eigenstates of operators $\hat{L}^2$, $\hat{J}^2$, and $\hat{J}_z$, via spin eigenkets $|\uparrow\rangle$ and $|\downarrow\rangle$; 

(ii) for each such state, find all the possible values of observables $L^2$, $L_z$, $S^2$, and $S_z$, the 
probability of each listed value, and the expectation value for each of the observables. 

5.42. Taking into account the electron's spin, find the energy spectrum of an electron, free to move 
within a plane, besides being placed into a uniform magnetic field B, normal to the plane. Compare the 
result with the Landau level picture discussed in Sec. 3.2. 




Chapter 6. Perturbation Theories 

This chapter discusses several perturbative approaches to problems of quantum mechanics, and their 
simplest applications including the Stark effect, the fine structure of atomic levels, and the Zeeman 
effect. Moreover, the discussion of the perturbation theory of transitions to continuous spectrum and the 
Golden Rule of quantum mechanics in the end of this chapter will naturally bring us to the issue of open 
quantum systems - to be discussed in more detail in the next chapter. 


6.1. Eigenvalue/eigenstate problems 

Unfortunately, only a few problems of quantum mechanics may be solved exactly in the 
analytical form. Actually, in the previous chapters we have solved a substantial fraction of such 
problems for a single particle, and for multi-particle problems the exactly solvable cases are even more 
rare. However, most practical problems of physics feature a certain small parameter, and this smallness 
may be exploited by various approximate analytical methods. Earlier in the course, we have explored 
one of them, the WKB approximation, which is adequate for a particle moving through a slowly 
changing potential profile. Now I will discuss alternative approaches that are more suitable for other 
cases. The historic name for these approaches is the perturbation theory, but it is more fair to speak 
about several such theories, because they differ depending on the type of the problem. 

The simplest perturbation theories address eigenproblems for systems described by time-independent 
Hamiltonians of the type 

$$\hat{H} = \hat{H}^{(0)} + \hat{H}^{(1)}, \tag{6.1a}$$

where the perturbation operator $\hat{H}^{(1)}$ is "small" - in the sense that its addition to the unperturbed operator 
$\hat{H}^{(0)}$ results in a relatively small change of eigenvalues $E_n$ of the system. A typical problem of this type 
is the 1D weakly anharmonic oscillator (Fig. 1) described by Hamiltonian (1a) with 

Weakly anharmonic oscillator 

$$\hat{H}^{(0)} = \frac{\hat{p}^2}{2m} + \frac{m\omega_0^2\hat{x}^2}{2}, \qquad \hat{H}^{(1)} = \alpha\hat{x}^3 + \beta\hat{x}^4 + \dots, \tag{6.1b}$$

with small coefficients $\alpha$, $\beta$, .... 

I will use the anharmonic oscillator as our first particular example, but let me start from 
describing the perturbative approach to the general time-independent Hamiltonian (1a). In the bra-ket 
formalism, the eigenproblem for the perturbed system is 

$$(\hat{H}^{(0)} + \hat{H}^{(1)})|n\rangle = E_n|n\rangle. \tag{6.2}$$


Let the eigenstates and eigenvalues of the unperturbed Hamiltonian, which satisfy equation 

$$\hat{H}^{(0)}|n^{(0)}\rangle = E_n^{(0)}|n^{(0)}\rangle, \tag{6.3}$$

be known. In this case, to solve problem (2) means to find, first, its perturbed eigenvalues $E_n$ and, 
second, coefficients $\langle n'^{(0)}|n\rangle$ of the expansion of perturbed state vectors $|n\rangle$ in series over the unperturbed 
ones, $|n'^{(0)}\rangle$: 

$$|n\rangle = \sum_{n'} |n'^{(0)}\rangle\langle n'^{(0)}|n\rangle. \tag{6.4}$$



Fig. 6.1. The simplest problem for the perturbation theory application: a 1D weakly anharmonic 
oscillator. (Dashed lines characterize the unperturbed, harmonic oscillator.) 


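Let us plug Eq. (4), with the summation index $n'$ replaced with $n''$, into both parts of Eq. (2): 

$$\sum_{n''} \langle n''^{(0)}|n\rangle\,\hat{H}^{(0)}|n''^{(0)}\rangle + \sum_{n''} \langle n''^{(0)}|n\rangle\,\hat{H}^{(1)}|n''^{(0)}\rangle = \sum_{n''} \langle n''^{(0)}|n\rangle\,E_n|n''^{(0)}\rangle, \tag{6.5}$$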


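and then inner-multiply all terms by an arbitrary unperturbed bra-vector $\langle n'^{(0)}|$. Assuming that the system 
of unperturbed eigenstates is orthonormal, $\langle n'^{(0)}|n''^{(0)}\rangle = \delta_{n'n''}$, and using Eq. (3) in the first term of the 
left-hand part, we get the following system of linear equations 

$$\sum_{n''} \langle n''^{(0)}|n\rangle\,H^{(1)}_{n'n''} = \langle n'^{(0)}|n\rangle\,\bigl(E_n - E_{n'}^{(0)}\bigr), \tag{6.6}$$

where the matrix elements of the perturbation are calculated in unperturbed bra-kets: 

Perturbation's matrix elements 

$$H^{(1)}_{n'n''} \equiv \langle n'^{(0)}|\hat{H}^{(1)}|n''^{(0)}\rangle. \tag{6.7}$$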


The linear equation system (6) is still exact, 1 and is frequently used for numerical calculations. 
(Since the matrix coefficients (7) typically decrease when n ’ and/or n ” become very large, the sum in 
the left-hand part of Eq. (6) may be typically truncated, still giving acceptable accuracy of the solution.) 
For getting analytical results we need to make more explicit approximations. In the simple perturbation 
theory we are discussing now, this is achieved by the expansion of both eigenenergies and coefficients 
into the Taylor series in a certain small parameter $\mu$ of the problem: 


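$$E_n = E_n^{(0)} + E_n^{(1)} + E_n^{(2)} + \dots, \tag{6.8}$$

$$\langle n'^{(0)}|n\rangle = \langle n'^{(0)}|n\rangle^{(0)} + \langle n'^{(0)}|n\rangle^{(1)} + \langle n'^{(0)}|n\rangle^{(2)} + \dots, \tag{6.9}$$

where 2 

$$E_n^{(k)} \propto \langle n'^{(0)}|n\rangle^{(k)} \propto \mu^k. \tag{6.10}$$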


1 Please note its similarity with Eq. (2.215) of the 1D band theory. Indeed, the latter equation is not much more 
than a particular form of Eq. (6) for 1D wave mechanics, and a specific (periodic) potential $U(x)$ considered as 
perturbation. Moreover, the approximate treatment of the weak potential limit in Sec. 2.7 was essentially a 
particular case of the more general perturbation theory we are discussing now. 

2 Note that, by definition, $\langle n'^{(0)}|n\rangle^{(0)} = \delta_{n'n}$. 

In order to explore the 1st-order approximation, which ignores all terms $O(\mu^2)$ and higher, let us 
plug only the two first terms of expansions (8) and (9) into the basic system of equations (6): 

$$\sum_{n''} H^{(1)}_{n'n''}\bigl(\delta_{n''n} + \langle n''^{(0)}|n\rangle^{(1)}\bigr) = \bigl(\delta_{n'n} + \langle n'^{(0)}|n\rangle^{(1)}\bigr)\bigl(E_n^{(0)} + E_n^{(1)} - E_{n'}^{(0)}\bigr). \tag{6.11}$$


Now let us open the parentheses, and disregard all the remaining terms $O(\mu^2)$. The result is 

$$H^{(1)}_{n'n} = \delta_{n'n}E_n^{(1)} + \langle n'^{(0)}|n\rangle^{(1)}\bigl(E_n^{(0)} - E_{n'}^{(0)}\bigr). \tag{6.12}$$

This equation is valid for any set of indices $n$ and $n'$; let us start from the case $n = n'$ and immediately 
get a very simple (and the most important!) result: 

1st-order correction of energies 

$$E_n^{(1)} = H^{(1)}_{nn} \equiv \langle n^{(0)}|\hat{H}^{(1)}|n^{(0)}\rangle. \tag{6.13}$$


For example, let us see what this result gives for the two first perturbation terms in the weakly 
anharmonic oscillator (1b): 

$$E_n^{(1)} = \alpha\langle n^{(0)}|\hat{x}^3|n^{(0)}\rangle + \beta\langle n^{(0)}|\hat{x}^4|n^{(0)}\rangle. \tag{6.14}$$

As the reader should know from the solution of Problem 5.11, the first term is zero, while the second one 
yields 3 

$$E_n^{(1)} = \frac{3}{4}\beta x_0^4\,(2n^2 + 2n + 1). \tag{6.15}$$
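
Both statements may be checked without any integration, by letting the ladder operators act on the Fock states. The Python sketch below is only an illustration: it assumes the usual representation $\hat{x} = (x_0/\sqrt{2})(\hat{a} + \hat{a}^\dagger)$ with $\hat{a}|n\rangle = n^{1/2}|n-1\rangle$ and $\hat{a}^\dagger|n\rangle = (n+1)^{1/2}|n+1\rangle$, takes $x_0 = 1$, and stores a state as a dictionary of Fock-state amplitudes. The function names apply_x and x_power_element are hypothetical.

```python
from math import isclose, sqrt

def apply_x(state, x0=1.0):
    """Apply x = (x0/sqrt(2)) (a + a^dagger) to a state stored as {n: amplitude}."""
    out = {}
    for n, amp in state.items():
        if n > 0:
            # a|n> = sqrt(n) |n-1>
            out[n - 1] = out.get(n - 1, 0.0) + amp * sqrt(n) * x0 / sqrt(2)
        # a^dagger|n> = sqrt(n+1) |n+1>
        out[n + 1] = out.get(n + 1, 0.0) + amp * sqrt(n + 1) * x0 / sqrt(2)
    return out

def x_power_element(n_bra, power, n_ket, x0=1.0):
    """Matrix element <n_bra| x^power |n_ket> in the Fock basis."""
    state = {n_ket: 1.0}
    for _ in range(power):
        state = apply_x(state, x0)
    return state.get(n_bra, 0.0)

for n in range(6):
    cubic = x_power_element(n, 3, n)      # enters the term proportional to alpha
    quartic = x_power_element(n, 4, n)    # enters the term proportional to beta
    assert abs(cubic) < 1e-12
    assert isclose(quartic, 0.75 * (2 * n**2 + 2 * n + 1))
    print(n, cubic, quartic)
```

Multiplying the quartic element by $\beta$ (and restoring the factor $x_0^4$) gives the 1st-order correction for each level.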


Naturally, there should be some contribution from the (typically, larger) term proportional to $\alpha$, 
so we need to explore the 2nd approximation of the perturbation theory. However, before doing that, let 
us complete our discussion of its 1st order. For $n' \ne n$, Eq. (12) may be used to calculate the eigenstates 
rather than the eigenvalues: 

$$\langle n'^{(0)}|n\rangle^{(1)} = \frac{H^{(1)}_{n'n}}{E_n^{(0)} - E_{n'}^{(0)}}, \quad \text{for } n' \ne n. \tag{6.16}$$

This means that the eigenket's expansion (4), in the 1st order, may be represented as 

1st-order result for vectors 

$$|n^{(1)}\rangle = C|n^{(0)}\rangle + \sum_{n' \ne n} \frac{H^{(1)}_{n'n}}{E_n^{(0)} - E_{n'}^{(0)}}\,|n'^{(0)}\rangle. \tag{6.17}$$

Coefficient $C$ cannot be found from Eq. (12); however, requiring the final state $n$ to be normalized, we 
see that other terms may provide only corrections $O(\mu^2)$, so in the 1st order we should take $C = 1$. The 
most important feature of Eq. (17) is its denominator: the closer the (unperturbed) eigenenergies of two 
states, the larger is their mutual contribution (hybridization), created by the perturbation. 

3 The result for $n = 0$ may be readily calculated in the wave-mechanics style as well, using Eq. (2.269) for 
the unperturbed ground state wavefunction, and the table integral MA Eq. (6.9d): 

$$\langle 0^{(0)}|\hat{x}^4|0^{(0)}\rangle = \int_{-\infty}^{+\infty} \psi_0^{(0)*} x^4 \psi_0^{(0)}\,dx = \frac{1}{\pi^{1/2}x_0}\int_{-\infty}^{+\infty} x^4 \exp\left\{-\frac{x^2}{x_0^2}\right\}dx = \frac{3}{4}x_0^4,$$

but for higher values of $n$, such calculations are much harder, because of the more involved Eq. (2.279) for $\psi_n^{(0)}$. Note 
also that at $n \gg 1$, Eq. (15) gives predictions which coincide with those of the classical theory of weakly 
nonlinear oscillations - see, e.g., CM Sec. 4.2, in particular, Eq. (4.49). 

This feature also affects the 1st approximation's validity condition that may be quantified using 
Eq. (16): the magnitudes of all the bra-kets it describes have to be much less than the unperturbed 
product $\langle n^{(0)}|n\rangle^{(0)} = 1$, so that all elements of the perturbation matrix have to be much less than the 
difference between the corresponding unperturbed energies. For the anharmonic oscillator's energy 
correction (15), this requirement is reduced to $E_n^{(1)} \ll \hbar\omega_0$. 
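
This condition may be illustrated by comparing Eq. (15) with a direct numerical solution of the exact system (6), truncated to a finite number of unperturbed states. The Python sketch below is only an illustration under stated assumptions: it keeps just the quartic term of the perturbation ($\alpha = 0$), uses units with $\hbar = \omega_0 = x_0 = 1$, relies on the NumPy library, and truncates the Fock basis at an arbitrarily chosen size of 80 states. The function names are hypothetical.

```python
import numpy as np

def quartic_levels(beta, size=80, levels=3):
    """Lowest eigenvalues of H = (n + 1/2) + beta x^4 in a truncated Fock basis."""
    n = np.arange(size)
    a = np.diag(np.sqrt(n[1:]), k=1)       # annihilation operator: <n-1|a|n> = sqrt(n)
    x = (a + a.T) / np.sqrt(2)             # coordinate in units of x0
    h = np.diag(n + 0.5) + beta * np.linalg.matrix_power(x, 4)
    return np.linalg.eigvalsh(h)[:levels]  # eigenvalues in ascending order

def first_order_levels(beta, levels=3):
    """Unperturbed energies plus the 1st-order correction of Eq. (6.15)."""
    n = np.arange(levels)
    return n + 0.5 + 0.75 * beta * (2 * n**2 + 2 * n + 1)

for beta in (0.001, 0.01, 0.1, 0.5):
    exact = quartic_levels(beta)
    approx = first_order_levels(beta)
    print(f"beta = {beta}: max deviation = {np.max(np.abs(exact - approx)):.2e}")
```

In this sketch the deviation of the 1st-order result from the numerical eigenvalues is small as long as the correction is much smaller than the level spacing, and grows rapidly as soon as the two become comparable.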

Now we are ready for going after the 2nd-order approximation to Eq. (6). Let us focus on 
the case $n' = n$, because as we already know, only this term will give us a correction to eigenenergies. 
Moreover, we see that since the left-hand side of Eq. (6) already has the small factor $H^{(1)}_{n'n''} \propto \mu$, the 
bra-ket coefficients in that part may be taken from the 1st order result (16). As a result, we get 

$$E_n^{(2)} = \sum_{n''} \langle n''^{(0)}|n\rangle^{(1)} H^{(1)}_{nn''} = \sum_{n'' \ne n} \frac{H^{(1)}_{n''n}H^{(1)}_{nn''}}{E_n^{(0)} - E_{n''}^{(0)}}. \tag{6.18}$$

Since $\hat{H}^{(1)}$ represents an observable (energy), and hence has to be Hermitian, we may rewrite this 
expression as 

$$E_n^{(2)} = \sum_{n' \ne n} \frac{\bigl|H^{(1)}_{n'n}\bigr|^2}{E_n^{(0)} - E_{n'}^{(0)}} \equiv \sum_{n' \ne n} \frac{\bigl|\langle n'^{(0)}|\hat{H}^{(1)}|n^{(0)}\rangle\bigr|^2}{E_n^{(0)} - E_{n'}^{(0)}}. \tag{6.19}$$


This is the much celebrated 2nd-order perturbation result that frequently (in sufficiently
symmetric problems) is the first nonvanishing correction to the state energy - for example, from the
cubic term (proportional to $\alpha$) in our weakly anharmonic oscillator problem (1). In order to calculate the
corresponding correction, we may use another result of Problem 5.6:

$$\langle n'|\hat x^3|n\rangle = \left(\frac{x_0}{\sqrt 2}\right)^3 \times \left\{ [n(n-1)(n-2)]^{1/2}\delta_{n',n-3} + 3n^{3/2}\delta_{n',n-1} + 3(n+1)^{3/2}\delta_{n',n+1} + [(n+1)(n+2)(n+3)]^{1/2}\delta_{n',n+3} \right\}. \tag{6.20}$$


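So, according to Eq. (19), we need to calculate

$$E_n^{(2)} = \alpha^2 \left(\frac{x_0}{\sqrt 2}\right)^6 \sum_{n' \neq n} \frac{\left\{ [n(n-1)(n-2)]^{1/2}\delta_{n',n-3} + 3n^{3/2}\delta_{n',n-1} + 3(n+1)^{3/2}\delta_{n',n+1} + [(n+1)(n+2)(n+3)]^{1/2}\delta_{n',n+3} \right\}^2}{\hbar\omega_0 (n - n')}. \tag{6.21}$$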


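The summation is actually not as cumbersome as it may look, because all mixed products are proportional
to different Kronecker deltas and hence vanish, so that we need to sum up only the squares of each term:

$$E_n^{(2)} = \frac{\alpha^2}{\hbar\omega_0} \left(\frac{x_0}{\sqrt 2}\right)^6 \left[ \frac{n(n-1)(n-2)}{3} + \frac{9n^3}{1} + \frac{9(n+1)^3}{-1} + \frac{(n+1)(n+2)(n+3)}{-3} \right] \equiv -\frac{15}{4}\frac{\alpha^2 x_0^6}{\hbar\omega_0}\left(n^2 + n + \frac{11}{30}\right). \tag{6.22}$$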
Please notice that all energy level corrections are negative, regardless of the sign of $\alpha$. On the
contrary, the 1st-order correction $E_n^{(1)}$ (15) depends on the sign of parameter $\beta$, so that the net correction,
$E_n^{(1)} + E_n^{(2)}$, may be of any sign.
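The closed form (22) can be cross-checked numerically. The following is only a sketch: it assumes units in which $x_0 = 1$ and $\hbar\omega_0 = 1$, takes $\hat x = (x_0/\sqrt2)(\hat a + \hat a^\dagger)$, and uses a truncated matrix whose size (40) is an arbitrary choice; the function names are illustrative, not from the text.

```python
import numpy as np

def second_order_cubic(n, size=40):
    """Sum (19) for the perturbation alpha*x^3, in units of alpha^2 x0^6 / (hbar omega0).

    Assumes n is well below `size`, so that the truncation does not affect
    the matrix elements <n'|x^3|n> that enter the sum.
    """
    a = np.diag(np.sqrt(np.arange(1, size)), k=1)   # annihilation operator
    x = (a + a.T) / np.sqrt(2.0)                    # x in units of x0
    x3 = x @ x @ x
    total = 0.0
    for k in range(size):
        if k != n:
            total += x3[k, n] ** 2 / (n - k)        # E_n^(0) - E_k^(0) = n - k
    return total

def closed_form(n):
    """Right-hand side of Eq. (22) in the same units."""
    return -15.0 / 4.0 * (n ** 2 + n + 11.0 / 30.0)

for n in range(4):
    print(n, second_order_cubic(n), closed_form(n))
```

The two columns should agree to rounding accuracy, and all values are negative, in line with the remark above.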

Results (17) and (19) are clearly inapplicable to the degenerate case where, in the absence of 
perturbation, several states correspond to the same energy level, because of the divergence of their 
denominators. 4 This divergence hints that the largest effect of the perturbation in that case is the 
degeneracy lifting, e.g., splitting of the initially degenerate energy level $E^{(0)}$ (Fig. 2), and that for the
analysis of this case we can, to the first approximation, ignore the effect of all other energy levels. (A 
careful analysis shows that this is indeed the case until the level splitting becomes comparable with the 
distance to other energy levels.) 


Fig. 6.2. Lifting the energy level degeneracy by a perturbation (schematically): the level $E^{(0)}$ of $\hat H = \hat H^{(0)}$ splits into levels $E_1$, $E_2$, ... of $\hat H = \hat H^{(0)} + \hat H^{(1)}$.


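Limiting the summation in Eq. (6) to the group of N degenerate states with equal $E_{n'}^{(0)} \equiv E^{(0)}$,
we reduce it to

$$\sum_{n''=1}^{N} \langle n''^{(0)}|n\rangle H^{(1)}_{n'n''} = \langle n'^{(0)}|n\rangle \left(E_n - E^{(0)}\right), \tag{6.23}$$

where $n'^{(0)}$ and $n''^{(0)}$ number the N states of the degenerate group. 5 Equation (23) may be rewritten as

$$\sum_{n''=1}^{N} \langle n''^{(0)}|n\rangle \left(H^{(1)}_{n'n''} - E_n^{(1)}\delta_{n'n''}\right) = 0, \quad \text{where } E_n^{(1)} \equiv E_n - E^{(0)}. \tag{6.24}$$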

For each n = 1, 2, ...N, this is a system of N linear, homogeneous equations (with N terms each) for
unknown coefficients $\langle n''^{(0)}|n\rangle$. In this problem, we readily recognize the problem of diagonalization of
the perturbation matrix $H^{(1)}$ - cf. Sec. 4.4 and in particular Eq. (4.101). As in the general case, in the
condition of self-consistency of the system, we can change the notation of the lower index of $E^{(1)}$, for
example to n:

$$\begin{vmatrix} H^{(1)}_{11} - E_n^{(1)} & H^{(1)}_{12} & \cdots \\ H^{(1)}_{21} & H^{(1)}_{22} - E_n^{(1)} & \cdots \\ \cdots & \cdots & \cdots \end{vmatrix} = 0. \tag{6.25}$$


4 This is exactly the reason why such perturbation theories run into serious problems for systems with continuous 
spectrum, and other approximate techniques (such as the WKB approximation) are often necessary. 

5 Note that the choice of the basis is to some extent arbitrary, because due to the linearity of equations of quantum
mechanics, any linear combination of states $n''^{(0)}$ is also an eigenstate of the unperturbed Hamiltonian. However,
for using Eq. (24), these combinations have to be orthonormal, as was suggested at the derivation of Eq. (6).
According to the definition (24) of $E_n^{(1)}$, the resulting N energy levels $E_n$ may be found as $E^{(0)} + E_n^{(1)}$,
where $E_n^{(1)}$ are the N roots of Eq. (25).

If the perturbation matrix is diagonal, the result is extremely simple,

$$E_n - E^{(0)} \equiv E_n^{(1)} = H^{(1)}_{nn}, \tag{6.26}$$

and formally coincides with Eq. (13) for the non-degenerate case, but now may give a different result for
each of the N previously degenerate states n.


Let us see what this theory gives for several important examples. First of all, let us consider
a two-level system (or a system with two degenerate states with energy far from all other levels), with
an arbitrary perturbation matrix 6

$$H^{(1)} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}. \tag{6.27a}$$


Since both the unperturbed Hamiltonian and the operator of its perturbation are Hermitian, the
diagonal elements of matrix $H^{(1)}$ are real, and its off-diagonal elements are complex conjugates of each
other. As a result, we can present the matrix in the same form as in Eq. (4.106):

$$H^{(1)} = \begin{pmatrix} a_0 + a_z & a_x - i a_y \\ a_x + i a_y & a_0 - a_z \end{pmatrix} = a_0 I + a_x\sigma_x + a_y\sigma_y + a_z\sigma_z \equiv a_0 I + \mathbf{a}\cdot\boldsymbol{\sigma}, \tag{6.27b}$$


where scalar $a_0$ and the Cartesian components of vector $\mathbf a$ are real c-number coefficients. The
corresponding characteristic equation,

$$\begin{vmatrix} a_0 + a_z - E^{(1)} & a_x - i a_y \\ a_x + i a_y & a_0 - a_z - E^{(1)} \end{vmatrix} = 0, \tag{6.28}$$


has the solution that is familiar to the reader from Chapters 2 and 4: 


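$$E_\pm^{(1)} \equiv E_\pm - E^{(0)} = a_0 \pm a \equiv a_0 \pm \left(a_x^2 + a_y^2 + a_z^2\right)^{1/2} = \frac{H_{11} + H_{22}}{2} \pm \left[\left(\frac{H_{11} - H_{22}}{2}\right)^2 + H_{12}H_{21}\right]^{1/2}. \tag{6.29}$$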


Let us discuss the physics of this simple result. Parameter $a_0 = (H_{11} + H_{22})/2$ is evidently the
correction to the average energy of both states, which does not give any contribution to the level splitting.
The splitting, $\Delta E \equiv E_+ - E_-$, is a hyperbolic function of coefficient $a_z = (H_{11} - H_{22})/2$ that describes the
direct contributions (13) to the eigenstates due to the perturbation. A plot of this function is the famous
level-anticrossing diagram (Fig. 3) that has already been discussed in Sec. 2.5 in a particular context of
the weak potential limit of the 1D band theory - see Fig. 2.29.
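The anticrossing behavior is easy to reproduce numerically. The sketch below assumes arbitrary illustrative values of the matrix elements (in arbitrary energy units) and compares a direct diagonalization of the 2×2 perturbation matrix with the closed form (29); the function names are hypothetical.

```python
import numpy as np

def two_level_shifts(h11, h22, h12):
    """Eigenvalues (ascending) of the Hermitian 2x2 perturbation matrix."""
    h1 = np.array([[h11, h12],
                   [np.conj(h12), h22]], dtype=complex)
    return np.linalg.eigvalsh(h1)

def closed_form_shifts(h11, h22, h12):
    """Eq. (29): a0 -/+ sqrt(az^2 + |H12|^2)."""
    a0 = (h11 + h22) / 2.0
    az = (h11 - h22) / 2.0
    a = np.sqrt(az ** 2 + abs(h12) ** 2)
    return np.array([a0 - a, a0 + a])

# Sweep a_z through zero at a fixed coupling H12 = 0.5 (illustrative value):
for az in (-2.0, -1.0, 0.0, 1.0, 2.0):
    e_minus, e_plus = two_level_shifts(az, -az, 0.5)
    print(f"a_z = {az:+.1f}: splitting = {e_plus - e_minus:.4f}")
```

The splitting never drops below $2|H_{12}|$, which is reached at $a_z = 0$, i.e. in the middle of the anticrossing diagram.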

Now we see that this is a general result for any two-level system. The examples of this behavior 
that we already know include the coupled quantum wells (see Fig. 2.29 and its discussion), band theory 
in the weak coupling limit (Sec. 2.5), and spin-1/2 systems discussed through Chapter 4 and in Sec. 5.1.
By the way, from Sec. 4.4 we already know the perturbed states in the middle of the anticrossing 


6 For brevity, I am dropping the upper index (1) in the matrix elements. 
diagram (at $a_z = 0$). For example, if $a_y = 0$, then our perturbation Hamiltonian matrix (27), besides the
trivial term proportional to $a_0$, is proportional to $\sigma_x$, and hence we can use the result (4.114) to write: 7

$$|\pm\rangle = \frac{1}{\sqrt 2}\left(|1^{(0)}\rangle \pm |2^{(0)}\rangle\right), \tag{6.30}$$

where $1^{(0)}$ and $2^{(0)}$ are the system's states in the absence of the perturbation.



Fig. 6.3. Level-anticrossing diagram for an 
arbitrary two-level system. 


This analysis shows that other results of our discussions of particular two-level systems in Sec.
2.6 and 4.6 are also general. For example, if we put any such two-level system into an initial state
different from one of the eigenstates ±, the probability of finding it in either of states $1^{(0)}$ or $2^{(0)}$ will
oscillate with frequency

$$\Omega = \frac{\Delta E}{\hbar} \equiv \frac{E_+ - E_-}{\hbar}. \tag{6.31}$$

Hence, for a spin-1/2 particle in a z-oriented magnetic field, the periodic oscillations of the x- and y-
components of spin vector, described by Eqs. (4.196) and (4.202), may be interpreted not only as the
torque-induced precession of spin within the [x, y] plane, but alternatively as the quantum oscillations of
the z-component of spin between states ↑ and ↓ with energies $E_\uparrow$ and $E_\downarrow$ given by Eq. (4.167).
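These oscillations may be illustrated by a direct numerical evolution. The sketch below assumes $\hbar = 1$, drops the trivial term $a_0$ (which only adds a common phase), and starts the system in the unperturbed state $1^{(0)}$; the function names and the sample vectors $\mathbf a$ are illustrative.

```python
import numpy as np

def probability_in_state_1(t, a_vec, hbar=1.0):
    """Probability to find the system in state 1 at time t, starting from state 1."""
    a_x, a_y, a_z = a_vec
    h = np.array([[a_z, a_x - 1j * a_y],
                  [a_x + 1j * a_y, -a_z]], dtype=complex)
    energies, states = np.linalg.eigh(h)
    psi0 = np.array([1.0, 0.0], dtype=complex)
    amplitudes = states.conj().T @ psi0               # expansion over eigenstates
    psi_t = states @ (np.exp(-1j * energies * t / hbar) * amplitudes)
    return float(abs(psi_t[0]) ** 2)

def oscillation_frequency(a_vec, hbar=1.0):
    """Eq. (31): (E+ - E-)/hbar = 2|a|/hbar."""
    return 2.0 * float(np.linalg.norm(a_vec)) / hbar

a_vec = (1.0, 0.0, 0.0)                               # middle of the anticrossing
period = 2.0 * np.pi / oscillation_frequency(a_vec)
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(fraction, probability_in_state_1(fraction * period, a_vec))
```

At $a_z = 0$ the probability swings fully between 1 and 0; for $a_z \neq 0$ the oscillations have the same form but a reduced swing.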

Some other examples of such oscillations may be rather unexpected. For example, the
ammonia molecule NH3 (Fig. 4) has two symmetric states which differ by the inversion of the
nitrogen atom relative to the plane of the three hydrogen atoms, and are coupled due to quantum-
mechanical tunneling of the nitrogen atom through the plane of hydrogen atoms. 8 Since for this
molecule, the level splitting $\Delta E$ corresponds to an experimentally convenient frequency $\Omega/2\pi \approx 24$ GHz,
it played an important historic role in the initial development of the first atomic frequency standards and
microwave quantum generators (masers) in the 1950s, 9 which paved the way toward the development of
laser technology.


7 Alternatively, if $a_x = 0$, then $|\pm\rangle = (1/\sqrt2)(|1^{(0)}\rangle \pm i|2^{(0)}\rangle)$. Note that besides a phase coefficient, these states are
similar in that they present a coherent superposition of the unperturbed states, with a 50/50 chance to find the
perturbed system in any of those states. In that sense, the effects of perturbation coefficients $a_x$ and $a_y$ are similar.

8 Since the hydrogen atoms are much lighter, it is more fair to speak about their correlated tunneling around the 
(nearly immobile) nitrogen atom. 

9 In particular, these molecules were used in the demonstration of the first maser by C. Townes’ group in 1954. 
Fig. 6.4. Ammonia molecule and its inversion.


6.2. The Stark effect 


Another example of the level degeneracy lifted by a perturbation is the linear Stark effect -
atomic level splitting by an external electric field. Let us study this effect, in the linear approximation,
for a hydrogen-like atom. Taking the direction of the external electric field $\boldsymbol{\mathcal E}$ (which is practically uniform
on the atomic scale) for the z-axis, the perturbation may be represented by the following Hamiltonian: 10

$$\hat H^{(1)} = -q\mathcal E \hat z = -q\mathcal E r\cos\theta. \tag{6.32}$$


(Since we will work in the coordinate representation, we may skip the operator sign from this point on.) 


As you (should :-) remember, energy levels of a hydrogen-like atom depend only on the main
quantum number n - see Eq. (3.191); hence all states but the ground state n = 1 ("1s" in the
spectroscopic nomenclature), in which l = m = 0, have some degeneracy that grows rapidly with n. This
is why I will carry out the calculations only for the lowest degenerate level with n = 2. Since generally
0 ≤ l ≤ n - 1, here l may be equal to either 0 (one 2s state, with m = 0) or 1 (three 2p states, with m = 0, ±1).
Due to this 4-fold degeneracy, $H^{(1)}$ is a 4×4 matrix with 16 elements:


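$$H^{(1)} = \begin{pmatrix} H_{11} & H_{12} & H_{13} & H_{14} \\ H_{21} & H_{22} & H_{23} & H_{24} \\ H_{31} & H_{32} & H_{33} & H_{34} \\ H_{41} & H_{42} & H_{43} & H_{44} \end{pmatrix}, \tag{6.33}$$

with both the rows and the columns numbered in the following order: (l = 0, m = 0), (l = 1, m = 0), (l = 1, m = +1), (l = 1, m = -1).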


However, please do not be scared. First, due to the Hermitian character of the operator, only 10
of the matrix elements (4 diagonal ones and 6 off-diagonal elements) may be substantially different.
Moreover, due to a high symmetry of the problem, there are a lot of zeros even among these elements.
Indeed, let us have a look at the angular components $Y_l^m$ of the corresponding wavefunctions, described
by Eqs. (3.174)-(3.175). For states with m = ±1, the azimuthal parts of wavefunctions are proportional to
$\exp\{\pm i\varphi\}$; hence the off-diagonal elements $H_{34}$ and $H_{43}$ of matrix (33), relating these functions, are
proportional to


10 If there is any doubt why, please revisit the discussion of Eq. (2.247), in which we should now take $F = q\mathcal E$.


Stark 

effect’s 

perturbation 
$$\oint d\Omega\, Y_1^{\pm 1 *} \hat H^{(1)} Y_1^{\mp 1} \propto \int_0^{2\pi} d\varphi \left(e^{\pm i\varphi}\right)^* e^{\mp i\varphi} = 0. \tag{6.34}$$

The azimuthal-angle symmetry also kills the off-diagonal elements $H_{13}$, $H_{14}$, $H_{23}$, $H_{24}$ (and hence
their complex conjugates $H_{31}$, $H_{41}$, $H_{32}$, and $H_{42}$), because they relate states with m = 0 and m ≠ 0, and
are proportional to

$$\oint d\Omega\, Y_l^{0 *} \hat H^{(1)} Y_1^{\pm 1} \propto \int_0^{2\pi} d\varphi\, e^{\pm i\varphi} = 0. \tag{6.35}$$


For the diagonal elements $H_{33}$ and $H_{44}$, corresponding to m = ±1, the azimuthal-angle integral
does not vanish, but since the spherical functions depend on the polar angle as $\sin\theta$, the matrix elements
are proportional to

$$\oint d\Omega\, Y_1^{\pm 1 *} \hat H^{(1)} Y_1^{\pm 1} \propto \int_0^{\pi} \sin\theta\, d\theta\, \sin\theta\cos\theta\sin\theta = \int_{-1}^{+1} \cos\theta\,(1 - \cos^2\theta)\, d(\cos\theta), \tag{6.36}$$

i.e. are equal to zero as any limit-symmetric integral of an odd function. Finally, for states 2s and 2p
with m = 0, the diagonal elements $H_{11}$ and $H_{22}$ are also killed by the polar-angle integration:

$$\oint d\Omega\, Y_0^{0 *} \hat H^{(1)} Y_0^{0} \propto \int_0^{\pi} \sin\theta\, d\theta\, \cos\theta = \int_{-1}^{+1} \cos\theta\, d(\cos\theta) = 0, \tag{6.37a}$$

$$\oint d\Omega\, Y_1^{0 *} \hat H^{(1)} Y_1^{0} \propto \int_0^{\pi} \sin\theta\, d\theta\, \cos^3\theta = \int_{-1}^{+1} \cos^3\theta\, d(\cos\theta) = 0. \tag{6.37b}$$

Hence, the only nonvanishing matrix elements are two off-diagonal elements $H_{12}$ and $H_{21}$ relating
different states with m = 0, because they are proportional to

$$\oint d\Omega\, Y_0^{0 *} \cos\theta\, Y_1^{0} = \frac{\sqrt 3}{4\pi} \int_0^{2\pi} d\varphi \int_0^{\pi} \sin\theta\, d\theta\, \cos^2\theta = \frac{1}{\sqrt 3} \neq 0. \tag{6.38}$$

What remains is to use Eqs. (3.199) for the radial parts of these functions to finish the calculation of
those two matrix elements:

$$H_{12} = H_{21} = -\frac{q\mathcal E}{\sqrt 3} \int_0^{\infty} r^2 dr\, \mathcal R_{2,0}(r)\, r\, \mathcal R_{2,1}(r). \tag{6.39}$$

Due to the structure of function $\mathcal R_{2,0}(r)$, the integral
falls into a sum of two parts, both of the type we have already met. 11 The final result is

$$H_{12} = H_{21} = 3q\mathcal E r_0, \tag{6.40}$$

where $r_0$ is the radius scale given by Eq. (3.183); for the hydrogen atom it is just the Bohr radius $r_B$
(1.13).
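Result (40) may be verified by a direct numerical integration. The sketch below is written in units of $r_0$ and assumes the standard normalized radial functions $\mathcal R_{2,0} = 2^{-1/2}(1 - r/2)e^{-r/2}$ and $\mathcal R_{2,1} = (2\sqrt6)^{-1} r e^{-r/2}$, together with the usual sign convention for $Y_0^0$ and $Y_1^0$; these explicit forms are an assumption here, since Eqs. (3.199) are not reproduced in this section. A different sign convention would flip the sign of $H_{12}$ but not the resulting level shifts.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule (kept local to avoid NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def stark_h12_over_qEr0(r_max=80.0, n_r=200001, n_theta=20001):
    """H12 / (q E r0) for the n = 2 level, from Eqs. (32), (38) and (39)."""
    r = np.linspace(0.0, r_max, n_r)                 # r in units of r0
    r20 = (1.0 - r / 2.0) * np.exp(-r / 2.0) / np.sqrt(2.0)
    r21 = r * np.exp(-r / 2.0) / (2.0 * np.sqrt(6.0))
    radial = trapezoid(r ** 3 * r20 * r21, r)        # integral of r^2 dr R20 r R21

    theta = np.linspace(0.0, np.pi, n_theta)
    y00 = 1.0 / np.sqrt(4.0 * np.pi)
    y10 = np.sqrt(3.0 / (4.0 * np.pi)) * np.cos(theta)
    angular = 2.0 * np.pi * trapezoid(y00 * np.cos(theta) * y10 * np.sin(theta), theta)

    return -angular * radial                         # minus sign from H1 = -q E r cos(theta)

print(stark_h12_over_qEr0())
```

The printed value should be close to 3, the angular factor being $1/\sqrt3$ and the radial integral $-3\sqrt3\, r_0$.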

Thus, for our case the perturbation matrix (33) is reduced to 


11 See, e.g., MA Eq. (6.7d). 
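$$H^{(1)} = \begin{pmatrix} 0 & 3q\mathcal E r_0 & 0 & 0 \\ 3q\mathcal E r_0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \tag{6.41}$$

so that the condition (25) of self-consistency is

$$\begin{vmatrix} -E_n^{(1)} & 3q\mathcal E r_0 & 0 & 0 \\ 3q\mathcal E r_0 & -E_n^{(1)} & 0 & 0 \\ 0 & 0 & -E_n^{(1)} & 0 \\ 0 & 0 & 0 & -E_n^{(1)} \end{vmatrix} = 0, \tag{6.42}$$

giving a very simple characteristic equation

$$\left(E_n^{(1)}\right)^2 \left[\left(E_n^{(1)}\right)^2 - \left(3q\mathcal E r_0\right)^2\right] = 0, \tag{6.43}$$

with the roots

$$E_{1,2}^{(1)} = 0, \quad E_{3,4}^{(1)} = \pm 3q\mathcal E r_0, \tag{6.44}$$

so that the degeneracy is only partly lifted - see Fig. 5.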


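Fig. 6.5. Linear Stark effect for level n = 2 of a hydrogen-like atom: the level $E_2^{(0)}$ splits into three levels separated by $3q\mathcal E r_0$, labeled (from top to bottom) by the state $(1/\sqrt2)(|2s\rangle + |2p\rangle)$ with m = 0, the two states with m = ±1, and the state $(1/\sqrt2)(|2s\rangle - |2p\rangle)$ with m = 0.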


Generally, in order to understand the nature of states corresponding to these levels, we should go
back to Eq. (24) with each calculated value of $E_n^{(1)}$, and calculate the corresponding expansion
coefficients $\langle n''^{(0)}|n\rangle$, which describe the perturbed states. However, in our simple case the outcome of
the procedure is clear in advance. Indeed, since the states with m = ±1 are not affected by the
perturbation (in the linear approximation in the electric field), their degeneracy is not lifted, and their energy
is unaffected - see the middle level in Fig. 5. On the other hand, the perturbation matrix connecting states
2s and 2p, i.e. the top left 2×2 part of the full matrix (41), is proportional to the Pauli matrix $\sigma_x$, and we
already know the result of its diagonalization - see Eqs. (4.114). This means that the upper and lower
split levels correspond to very simple linear combinations of the previously degenerate states,

$$|\pm\rangle = \frac{1}{\sqrt 2}\left(|2s\rangle \pm |2p\rangle\right), \tag{6.45}$$

both with m = 0.
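The same conclusions follow from a brute-force diagonalization of matrix (41). In this sketch the energies are measured in units of $q\mathcal E r_0$, and the basis order is the one of Eq. (33); the function name is illustrative.

```python
import numpy as np

def stark_n2_levels(coupling=3.0):
    """Diagonalize matrix (41); `coupling` is H12 in units of q*E*r0."""
    h1 = np.zeros((4, 4))
    h1[0, 1] = h1[1, 0] = coupling    # only the 2s <-> 2p (m = 0) elements survive
    return np.linalg.eigh(h1)         # eigenvalues in ascending order

shifts, states = stark_n2_levels()
print(shifts)                         # level shifts in units of q*E*r0
print(np.abs(states[:, [0, 3]]))      # weights of the lowest and highest states
```

The two shifted states have equal weights $1/\sqrt2$ on the 2s and 2p (m = 0) states and no admixture of the m = ±1 states, in agreement with Eq. (45).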
Finally, let us estimate the magnitude of the linear Stark effect for a hydrogen atom. For a very
high electric field of $\mathcal E = 3\times10^{6}$ V/m, 12 $q = e \approx 1.6\times10^{-19}$ C, and $r_0 = r_B \approx 0.5\times10^{-10}$ m, we get a level
splitting of $3q\mathcal E r_0 \approx 0.8\times10^{-22}$ J ≈ 0.5 meV. This number is much lower than the magnitude of the unperturbed energy of
the level, $E_2 = -E_H/(2\times2^2) \approx -3.4$ eV, so that the perturbation result is quite valid. On the other hand, the
splitting is much larger than the resolution limit imposed by the natural linewidth (~ $10^{-7}E_2$, see Chapter
9), so that the effect is quite observable even in substantially lower electric fields.
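The arithmetic of this estimate is short enough to script. The constants below are rounded textbook values (an assumption of this sketch, not taken from the source), and the function name is illustrative.

```python
ELEMENTARY_CHARGE = 1.602e-19   # C (rounded)
BOHR_RADIUS = 0.529e-10         # m (rounded)

def stark_splitting_joule(field, q=ELEMENTARY_CHARGE, r0=BOHR_RADIUS):
    """Level shift 3*q*E*r0 of Eq. (44), in joules, for a field given in V/m."""
    return 3.0 * q * field * r0

splitting_j = stark_splitting_joule(3e6)
splitting_mev = splitting_j / ELEMENTARY_CHARGE * 1e3
print(f"{splitting_j:.2e} J = {splitting_mev:.2f} meV")
```

With these inputs the result is slightly below the rounded figure of 0.5 meV quoted above, which used $r_B \approx 0.5\times10^{-10}$ m and further rounding.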


6.3. Fine structure of atomic levels 


Now let us analyze, for the simplest case of a hydrogen-like atom, the so-called fine structure of 
atomic levels - their degeneracy lifting even in the absence of external fields. In the limit when the 
effective speed v of electron motion is much smaller than the speed of light c (as it is in the hydrogen 
atom), the fine structure may be analyzed as a sum of two small relativistic effects. To analyze the first 
of these effects, let us expand the well-known classical relativistic expression 13 for the kinetic energy
$T = E - mc^2$ of a free particle with the rest mass m,

$$T = \left(m^2c^4 + p^2c^2\right)^{1/2} - mc^2 \equiv mc^2\left[\left(1 + \frac{p^2}{m^2c^2}\right)^{1/2} - 1\right], \tag{6.46}$$

into the Taylor series with respect to the small ratio $(p/mc)^2 \approx (v/c)^2$:

$$T = mc^2\left[1 + \frac{1}{2}\left(\frac{p}{mc}\right)^2 - \frac{1}{8}\left(\frac{p}{mc}\right)^4 + \dots - 1\right] = \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} + \dots, \tag{6.47}$$


and neglect all the terms besides the first (non-relativistic) one and the next term representing the first 
nonvanishing relativistic correction of T. 

In accordance with the correspondence principle, the quantum-mechanical problem in this 
approximation may be described by the perturbative Hamiltonian (1a), where the unperturbed
(non-relativistic) Hamiltonian of the problem, whose eigenstates and eigenenergies were discussed in Sec.
3.5, is

$$\hat H^{(0)} = \frac{\hat p^2}{2m} + \hat U(r), \quad U(r) = -\frac{C}{r}, \tag{6.48}$$

while the small kinetic-relativistic perturbation is

$$\hat H^{(1)} = -\frac{\hat p^4}{8m^3c^2} \equiv -\frac{1}{2mc^2}\left(\frac{\hat p^2}{2m}\right)^2. \tag{6.49a}$$

Using Eq. (48), we may rewrite the last formula as


12 This value approximately corresponds to the threshold of electric breakdown in air, due to the impact ionization 
on the surface of typical metallic electrodes. (Reducing air pressure only enhances the ionization and lowers the 
breakdown threshold.) As a result, experiments with higher fields are rather difficult. 

13 See, e.g., EM Sec. 9.3, in particular Eq. (9.78) - or any undergraduate text on special relativity. 
$$\hat H^{(1)} = -\frac{1}{2mc^2}\left(\hat H^{(0)} - \hat U(r)\right)^2, \tag{6.49b}$$

so that its matrix elements, participating in the characteristic equation (25) for a given degenerate energy
level (3.191), i.e. a given principal quantum number n, are

$$\langle nlm|\hat H^{(1)}|nl'm'\rangle = -\frac{1}{2mc^2}\langle nlm|\left(\hat H^{(0)} - \hat U(r)\right)\left(\hat H^{(0)} - \hat U(r)\right)|nl'm'\rangle, \tag{6.50}$$

where the bra- and ket-vectors describe the unperturbed eigenstates whose eigenfunctions (in the
coordinate representation) are given by Eq. (3.190): $\psi_{n,l,m} = \mathcal R_{n,l}(r)Y_l^m(\theta, \varphi)$.

It is straightforward (and hence left for the reader :-) to prove that all off-diagonal elements of
the set (50) are equal to 0. Thus we may use Eq. (26) for each set of quantum numbers {n, l, m}:


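$$E^{(1)}_{n,l,m} \equiv E_{n,l,m} - E_n^{(0)} = \langle nlm|\hat H^{(1)}|nlm\rangle = -\frac{1}{2mc^2}\left\langle \left(\hat H^{(0)} - \hat U(r)\right)^2 \right\rangle_{n,l,m} = -\frac{1}{2mc^2}\left(E_n^2 - 2E_n\langle \hat U\rangle_{n,l} + \langle \hat U^2\rangle_{n,l}\right), \tag{6.51}$$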


where index m has been dropped, because the radial wavefunctions $\mathcal R_{n,l}(r)$, which affect the averages, do
not depend on that quantum number. Now using Eqs. (3.183), (3.191) and the first two of Eqs. (3.201),
we finally get

$$E^{(1)}_{n,l} = -\frac{mC^4}{2\hbar^4c^2n^4}\left(\frac{n}{l + 1/2} - \frac{3}{4}\right) \equiv -\frac{2E_n^2}{mc^2}\left(\frac{n}{l + 1/2} - \frac{3}{4}\right). \tag{6.52}$$

Let us discuss this result. First of all, its last form confirms that correction (52) is indeed
much smaller than the unperturbed energy $E_n$ (and hence the perturbation theory is solid) if the latter is
much smaller than the relativistic rest energy $mc^2$ of the particle. Next, since in the Bohr problem n ≥ l + 1,
the first fraction in the parentheses of Eq. (52) is always larger than 1, so that the relativistic
correction to kinetic energy is negative for all n and l. (This is already evident from Eqs. (6.49), which
show that the correction Hamiltonian is a negatively defined form.) Finally, at a fixed principal number
n, the negative correction's magnitude decreases with the growth of l. This fact may be classically
interpreted using Eq. (3.200): the larger is l (at fixed n), the smaller is the particle's average distance from
the center, and hence the smaller is its effective velocity, the smaller is the magnitude of the quantum-
mechanical average of the negative relativistic correction (49a) to the kinetic energy.

Result (52) is conceptually valid for any physics of interaction $U(r) = -C/r$. However, if the
interaction is Coulombic, say between an electron with charge (-e) and a nucleus of charge (+Ze), there
is also another relativistic correction to energy, due to the so-called spin-orbit interaction. Its physics
may be understood from the following semi-qualitative, classical reasoning: from the "point of
view" of an electron rotating about the nucleus at constant distance r with velocity v, it is the nucleus, of
charge Ze, that rotates about the electron with velocity (-v) and hence time period $\mathcal T = 2\pi r/v$. From the
point of view of magnetostatics, such circular motion of electric charge Q = Ze is equivalent to the
constant circular electric current $I = Q/\mathcal T = (Ze)(v/2\pi r)$ which creates, at the electron's location, i.e. in the
center of the current loop, a magnetic field with magnitude 14

$$\mathcal B = \frac{\mu_0}{2r}I = \frac{\mu_0}{2r}\frac{Zev}{2\pi r} \equiv \frac{\mu_0 Zev}{4\pi r^2}. \tag{6.53}$$


The field’s direction n is perpendicular to the apparent plane of the nucleus’ rotation (i.e. that of the real 
rotation of the electron), and hence its vector may be readily expressed via the similarly directed vector 
L of electron’s angular (orbital) momentum: 


$$\boldsymbol{\mathcal B} = \frac{\mu_0 Zev}{4\pi r^2}\mathbf n = \frac{\mu_0 Ze}{4\pi r^3 m_e}m_e v r\,\mathbf n = \frac{\mu_0 Ze}{4\pi r^3 m_e}\mathbf L \equiv \frac{Ze}{4\pi\varepsilon_0 r^3 m_e c^2}\mathbf L, \tag{6.54}$$

where the last transition is due to the basic relation between the SI unit constants: $\varepsilon_0\mu_0 = 1/c^2$.

A more careful (but still classical) analysis of the problem 15 brings both good and bad news. The
bad news is that result (54) is wrong by a factor of 2 even for the circular motion, because the electron
moves with acceleration, and the reference frame bound to it cannot be considered inertial (as was
implied in the above reasoning), so that the actual magnetic field felt by the electron is

$$\boldsymbol{\mathcal B} = \frac{Ze}{8\pi\varepsilon_0 r^3 m_e c^2}\mathbf L. \tag{6.55}$$


The good news is that, so corrected, the result is valid (on the average) for not only circular but 
arbitrary (elliptic 16 ) orbital motion in the Coulomb field U(r). Hence from the discussion in Sec. 4.1 and 
Sec. 4.4 we may expect that the quantum-mechanical description of the interaction between this 
apparent magnetic field and electron’s spin moment (4.116) is given by the following perturbation 
Hamiltonian 


$$\hat H^{(1)} = -\hat{\mathbf m}\cdot\hat{\boldsymbol{\mathcal B}} = -\left(-\frac{e}{m_e}\hat{\mathbf S}\right)\cdot\left(\frac{Ze}{8\pi\varepsilon_0 r^3 m_e c^2}\hat{\mathbf L}\right) \equiv \frac{1}{2m_e^2c^2}\frac{Ze^2}{4\pi\varepsilon_0}\frac{1}{r^3}\hat{\mathbf S}\cdot\hat{\mathbf L}, \tag{6.56a}$$

where the small correction to value $g_e = 2$ of the electron's g-factor has been ignored, because Eq. (56) is
already a small correction. This expression is confirmed by the fully-relativistic Dirac theory, to be
discussed in Sec. 9.7 below; it yields, for an arbitrary central potential U(r), the following Hamiltonian
of the spin-orbit coupling:

$$\hat H^{(1)} = \frac{1}{2m_e^2c^2}\frac{1}{r}\frac{dU(r)}{dr}\hat{\mathbf S}\cdot\hat{\mathbf L}. \tag{6.56b}$$

For the Coulomb potential $U(r) = -Ze^2/4\pi\varepsilon_0 r$, this formula is reduced to Eq. (56a).


As we already know from the discussion in Sec. 5.7, such Hamiltonian commutes with all 
operators diagonal in the coupled representation (inside the blue line in Fig. 5.10): $\hat L^2$, $\hat S^2$, $\hat J^2$, and $\hat J_z$.
Hence, using Eq. (5.208) to rewrite the spin-orbit Hamiltonian as 


14 See, e.g., EM Sec. 5.1, in particular, Eq. (5.24). 

15 See, e.g., R. Harr and L. Curtis, Am. J. Phys. 55, 1044 (1987). 

16 See, e.g., CM Sec. 3.6. 
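$$\hat H^{(1)} = \frac{1}{2m_e^2c^2}\frac{Ze^2}{4\pi\varepsilon_0}\frac{1}{r^3}\,\frac{1}{2}\left(\hat J^2 - \hat L^2 - \hat S^2\right), \tag{6.57}$$

we may conclude that this operator is diagonal in the coupled representation with fixed quantum
numbers l, s, j, and $m_j$. As a result, in this representation, we may again use Eq. (26) for each set {l, j,
$m_j$}:

$$E^{(1)}_{n,j,l} = \frac{1}{2m_e^2c^2}\frac{Ze^2}{4\pi\varepsilon_0}\left\langle\frac{1}{r^3}\right\rangle_{n,l}\frac{1}{2}\left\langle \hat J^2 - \hat L^2 - \hat S^2\right\rangle_{j,s}, \tag{6.58}$$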


where the indices irrelevant for each particular term have been dropped. (As a reminder, the spin
quantum number s is fixed by the particle's nature; for our case of an electron, s = 1/2.) Now using the last of
Eqs. (3.201), and similar expressions (5.192), (5.197), and (5.203), we get an explicit expression for the
spin-orbit corrections 17

$$E^{(1)}_{n,j,l} = \frac{1}{2m_e^2c^2}\frac{Ze^2}{4\pi\varepsilon_0}\frac{\hbar^2}{2r_0^3n^3}\frac{j(j+1) - l(l+1) - 3/4}{l(l+1/2)(l+1)} \equiv \frac{E_n^2}{m_ec^2}\,n\,\frac{j(j+1) - l(l+1) - 3/4}{l(l+1/2)(l+1)}. \tag{6.59}$$

The last form of its right-hand part shows very clearly that this correction has the same scale as
the kinetic correction (52), 18 so that they should be considered together. In the first order of the
perturbation they may be just added, giving a very simple formula for the net fine structure of level n:

$$E^{(1)}_{\rm fine} = \frac{E_n^2}{2m_ec^2}\left(3 - \frac{4n}{j + 1/2}\right). \tag{6.60}$$

This simplicity, as well as the independence of the result of the orbital quantum number l, will become
less surprising when (in Sec. 9.7) we see that this formula follows in one shot from the Dirac theory, in
which the Bohr atom's energy spectrum is numbered only with n and j, but not l.

Let us recall (see Sec. 5.7) that for an electron (s = 1/2), the quantum number j may take n positive
half-integer values, from 1/2 to n - 1/2. With the account of this fact, Eq. (60) shows that the fine structure
of the n-th Bohr's energy level has n sub-levels - see Fig. 6.

Fig. 6.6. Fine structure of a hydrogen-like atom's level: the level $E_n$ splits into n sub-levels with j = 1/2, 3/2, 5/2, ..., n - 1/2 (from the lowest to the highest).


17 The factor l in the denominator does not give a divergence at l = 0, because in this case j = s = 1/2, and the
numerator turns into 0 as well. A careful analysis of this case (which may be found, e.g., in G. K. Woodgate,
Elementary Atomic Structure, 2nd ed., Oxford, 1983), as well as the exact solution of the Bohr atom problem
within the Dirac theory (Chapter 9) show that the final result (60), which is independent of l, is valid even in this
case.

18 This is natural, because the magnetic interaction of charged particles is an essentially relativistic effect, of the
same order (~$v^2/c^2$) as the kinetic correction (49a) - see, e.g., EM Sec. 5.1, in particular Eq. (5.3).
Please note that according to Eq. (5.203), each of these sub-levels is still (2j + 1)-times
degenerate in quantum number $m_j$. This degeneracy is very natural, because in the absence of external
field the system is still isotropic. Moreover, on each fine-structure level, besides the lowest (j = 1/2) and
the highest (j = n - 1/2) ones, each of the $m_j$-states is doubly-degenerate in the orbital quantum number l =
j ± 1/2 - see the labels of l in Fig. 6. (According to Eq. (5.215), each of these states, with fixed j and $m_j$,
may be represented as a linear combination of two states with adjacent values of $m_l$, and hence different
electron spin orientations, $m_s$ = ±1/2, weighed with the Clebsch-Gordan coefficients.)

These details aside, one may crudely say that the relativistic corrections make the total
eigenenergy grow with l, contributing to the effect already mentioned at our analysis of the periodic
table of elements in Sec. 3.7. The relative scale of this increase may be evaluated from the largest
deviation from the unperturbed energy $E_n$, reached for the state with j = 1/2 (and hence l = 0):


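$$\frac{\left|E^{(1)}_{\max}\right|}{\left|E_n\right|} = \frac{\left|E_n\right|}{m_ec^2}\left(2n - \frac{3}{2}\right) \equiv \left(\frac{Ze^2}{4\pi\varepsilon_0\hbar c}\right)^2\left(\frac{1}{n} - \frac{3}{4n^2}\right) \equiv Z^2\alpha^2\left(\frac{1}{n} - \frac{3}{4n^2}\right), \tag{6.61}$$

where $\alpha$ is the fine structure ("Sommerfeld") constant,

$$\alpha \equiv \frac{e^2}{4\pi\varepsilon_0\hbar c} \approx \frac{1}{137}, \tag{6.62}$$

that was already mentioned in Sec. 4.4. 19 These expressions show that the fine structure is indeed a
relatively small correction (~$\alpha^2$) for the hydrogen atom, but it rapidly grows (as $Z^2$) with the nuclear
charge (atomic number), and becomes rather substantial for the heaviest atoms with Z ~ 100.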
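The way the kinetic correction (52) and the spin-orbit correction (59) combine into the l-independent result (60) may be checked by a few lines of code. In this sketch all corrections are expressed in units of $E_n^2/m_ec^2$; the numerical estimate at the end assumes the rounded values $E_n = -13.6057\ \text{eV}/n^2$ (hydrogen, Z = 1) and $m_ec^2 = 510\,998.95$ eV, which are not taken from the text, and all function names are illustrative.

```python
def kinetic_correction(n, l):
    """Eq. (52) in units of E_n^2 / (m c^2)."""
    return -2.0 * (n / (l + 0.5) - 0.75)

def spin_orbit_correction(n, l, j):
    """Eq. (59) in the same units; used here only for l >= 1."""
    return n * (j * (j + 1) - l * (l + 1) - 0.75) / (l * (l + 0.5) * (l + 1))

def fine_structure(n, j):
    """Eq. (60) in the same units."""
    return 0.5 * (3.0 - 4.0 * n / (j + 0.5))

for n in range(2, 5):
    for l in range(1, n):
        for j in (l - 0.5, l + 0.5):
            total = kinetic_correction(n, l) + spin_orbit_correction(n, l, j)
            print(n, l, j, total, fine_structure(n, j))

# Splitting between the j = 3/2 and j = 1/2 sub-levels of the n = 2 level of hydrogen:
E2_EV = -13.6057 / 2 ** 2
MC2_EV = 510998.95
splitting_ev = (fine_structure(2, 1.5) - fine_structure(2, 0.5)) * E2_EV ** 2 / MC2_EV
print(f"n = 2 fine-structure splitting: {splitting_ev:.3e} eV")
```

The last two printed columns coincide for every allowed pair (l, j), so that the sum indeed depends only on n and j.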


6.4. The Zeeman effect

Now, we are ready to review the Zeeman effect - the lifting of atomic level degeneracy by an
external magnetic field. 20 Using Eq. (3.26) (with q = -e) for the description of the electron's orbital motion
in the field, and Eq. (4.116) for the operator of the electron's magnetic moment due to its spin-1/2, we see
that even for a hydrogen-like (i.e. single-electron) atom, neglecting the relativistic effects, the full
Hamiltonian is rather bulky:

$$\hat H = \frac{1}{2m_e}\left(\hat{\mathbf p} + e\hat{\mathbf A}\right)^2 - \frac{Ze^2}{4\pi\varepsilon_0 r} + \frac{e}{m_e}\boldsymbol{\mathcal B}\cdot\hat{\mathbf S}. \tag{6.63}$$

There are several simplifications we may make. First, let us assume that the external field is
spatially uniform on the atomic scale (which is a very good approximation for most cases), so that we can
take the vector-potential in an axially-symmetric gauge - cf. Eq. (3.132):


19 See the Selected Physical Constants appendix for the more exact value of this constant. Its expression in
Gaussian units, $\alpha = e^2/\hbar c$, makes even more evident the fact that $\alpha$ is just the fundamental constant ratio which
characterizes the strength (or rather the weakness :-) of electromagnetic effects in quantum mechanics - that in
particular makes the perturbative quantum electrodynamics possible. The alternative expression $\alpha^2 = E_H/m_ec^2$,
where $E_H$ is the Hartree energy (1.9), the scale of all $E_n$, is also very revealing.

20 It was discovered experimentally in 1896 by P. Zeeman who, amazingly, was fired from the University of 
Leiden for an unauthorized use of lab equipment for this work - just to receive a Nobel Prize for it in a few years. 
$$\mathbf A = \frac{1}{2}\boldsymbol{\mathcal B}\times\mathbf r. \tag{6.64}$$

Second, let us neglect the terms proportional to $\mathcal B^2$, which are small in practical magnetic fields of the
order of a few tesla. 21 The remaining term in the effective kinetic energy, describing the interaction
with the magnetic field, is linear in the momentum operator, so that we may repeat the standard classical
calculation 22 to reduce it to the product of $\mathcal B$ by the orbital magnetic moment's component $m_z = -eL_z/2m_e$
- besides that both $m_z$ and $L_z$ should be understood as operators now. As a result, the
Hamiltonian reduces to Eq. (1a), $\hat H^{(0)} + \hat H^{(1)}$, where $\hat H^{(0)}$ is that of the atom at $\mathcal B = 0$, and

$$\hat H^{(1)} = \frac{e\mathcal B}{2m_e}\left(\hat L_z + 2\hat S_z\right). \tag{6.65}$$


The form of the perturbation immediately reveals the major complication with the Zeeman effect
description. Namely, in comparison with its contribution (5.198) to the total angular momentum of the
electron, its spin-1/2 produces a twice larger contribution into the magnetic moment, so that the
right-hand part of Eq. (65) is not proportional to the total angular momentum. As a result, the effect description is
simple only in two limits.

If the magnetic field is so high that its effects are much stronger than the relativistic
(fine-structure) effects discussed in the last section, we may treat the two terms in Eq. (65) as independent
perturbations of different (orbital and spin) degrees of freedom. Since in the z-basis each of the
perturbation matrices is diagonal, we can again use Eq. (26):


$$E - E^{(0)} = \frac{e\mathcal B}{2m_e}\left(\langle n,l,m_l|\hat L_z|n,l,m_l\rangle + 2\langle m_s|\hat S_z|m_s\rangle\right) = \frac{e\mathcal B}{2m_e}\left(\hbar m_l + 2\hbar m_s\right) \equiv \mu_B\mathcal B\,(m_l \pm 1). \tag{6.66}$$

This result describes the splitting of each 2×(2l + 1)-degenerate energy level, with certain n and l, into
(2l + 3) levels (Fig. 7), with the adjacent level splitting of $\mu_B\mathcal B$, equal to ~$10^{-23}$ J ~ $10^{-4}$ eV per tesla. Note that
for l ≥ 1 all levels, besides the two top and the two bottom ones, remain doubly degenerate. This limit of the Zeeman effect is
sometimes called the Paschen-Back effect - whose simplicity was recognized only in the 1920s, because of
the need for very high magnetic fields for its observation.

Fig. 6.7. The Paschen-Back effect: the level $E^{(0)}_{n,l}$ splits into equidistant levels, with the doubly degenerate ones formed by pairs of states such as ($m_l$ = +2, $m_s$ = -1/2) and ($m_l$ = 0, $m_s$ = +1/2).
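The level count following from Eq. (66) can be obtained by a simple enumeration of the pairs ($m_l$, $m_s$). The following sketch measures the shifts in units of $\mu_B\mathcal B$; the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction

def paschen_back_levels(l):
    """Map each shift m_l + 2*m_s (in units of mu_B * B) to its degeneracy."""
    counts = Counter()
    for m_l in range(-l, l + 1):
        for m_s in (Fraction(1, 2), Fraction(-1, 2)):
            counts[int(m_l + 2 * m_s)] += 1
    return dict(sorted(counts.items()))

for l in (1, 2, 3):
    levels = paschen_back_levels(l)
    print(f"l = {l}: {len(levels)} levels, degeneracies {levels}")
```

For each l ≥ 1 the enumeration gives 2l + 3 levels, of which only the 2l - 1 inner ones are doubly degenerate.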


21 Despite its smallness, the quadratic term is necessary for description of the negative contribution of the orbital
motion to the magnetic susceptibility $\chi_m$ (the so-called orbital diamagnetism, see EM Sec. 5.5), whose analysis,
using Eq. (63), is left for reader's exercise.

22 See, e.g., EM Sec. 5.4, in particular Eqs. (5.95) and (5.100). 
In the opposite limit of low magnetic field, the Zeeman effect takes place on the background of
the fine structure splitting. As was discussed in Sec. 3, at $\mathcal B = 0$ each split sub-level has a 2(2j + 1)-fold
degeneracy corresponding to (2j + 1) different values of the half-integer quantum number $m_j$, ranging
from -j to +j, and 2 values of integer l = j ± 1/2 - see Fig. 6. The magnetic field lifts this degeneracy. 23
Indeed, in the coupled representation discussed in Sec. 5.7, perturbation (65) is described by the matrix
with elements

$$\langle j, m_j|\hat H^{(1)}|j', m_j'\rangle = \frac{e\mathcal B}{2m_e}\langle j, m_j|\hat L_z + 2\hat S_z|j', m_j'\rangle = \frac{e\mathcal B}{2m_e}\langle j, m_j|\hat J_z + \hat S_z|j', m_j'\rangle = \frac{e\mathcal B}{2m_e}\left(\hbar m_j\delta_{m_jm_j'} + \langle j, m_j|\hat S_z|j', m_j'\rangle\right). \tag{6.67}$$


Now plugging into the last term the Clebsch-Gordan expansions (5.216a) for the bra- and ket-vectors,
and taking into account that operator $\hat S_z$ gives non-zero bra-kets only for $m_s = m_s'$, we see that
matrix (67) becomes diagonal, so that we may again use Eq. (26) to get

$$E - E^{(0)} = \frac{e\mathcal{B}}{2m_e}\left(\hbar m_j \pm \frac{\hbar m_j}{2l + 1}\right) = \mu_B\mathcal{B}\,m_j\left(1 \pm \frac{1}{2l + 1}\right), \quad \text{for } -j \le m_j \le +j, \tag{6.68}$$

where the two signs correspond to the two possible values of $l = j \mp 1/2$ - see Fig. 8.

Fig. 6.8. Anomalous Zeeman effect in a hydrogen-like atom - schematically.
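The factor in the parentheses of Eq. (68) may be checked with exact rational arithmetic. The sketch below assumes the standard squared Clebsch-Gordan weights for coupling an orbital momentum $l$ with spin-1/2 (presumably the content of Eq. (5.216a), which is not reproduced here); the function name and its arguments are illustrative only:

```python
from fractions import Fraction

def zeeman_factor_spin_half(l, two_mj, upper):
    """Return <L_z + 2 S_z> / (hbar * m_j) for the coupled state with
    j = l + 1/2 (upper=True) or j = l - 1/2 (upper=False); two_mj is 2*m_j (an odd integer)."""
    m_j = Fraction(two_mj, 2)
    # Assumed standard squared Clebsch-Gordan weight of the spin-up component |m_l = m_j - 1/2, up>
    if upper:
        w_up = (l + m_j + Fraction(1, 2)) / (2 * l + 1)
    else:
        w_up = (l - m_j + Fraction(1, 2)) / (2 * l + 1)
    w_down = 1 - w_up                          # weight of |m_l = m_j + 1/2, down>
    s_z = Fraction(1, 2) * (w_up - w_down)     # <S_z> / hbar
    return (m_j + s_z) / m_j                   # <J_z + S_z> / (hbar * m_j)

factor_upper = zeeman_factor_spin_half(1, 3, upper=True)    # l = 1, j = 3/2, m_j = +3/2
factor_lower = zeeman_factor_spin_half(1, 1, upper=False)   # l = 1, j = 1/2, m_j = +1/2
print(factor_upper, factor_lower)
```

The result does not depend on $m_j$ and equals $1 \pm 1/(2l + 1)$, in agreement with the last form of Eq. (68).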


We see that the magnetic field splits each sub-level of the fine structure, with a given $l$, into
$2j + 1$ levels, with the distance between the levels depending on $l$. At the end of the 1890s, when the
Zeeman effect was first observed, there was no notion of spin at all, so that this puzzling result was
called the anomalous Zeeman effect. (In this terminology, the normal Zeeman effect is the one with no
spin splitting, i.e. without the second terms in the parentheses of Eqs. (66)-(68); it may be observed
experimentally in atoms with the net spin $s = 0$.)


23 In almost-hydrogen-like, but more complex atoms (such as those of alkali metals), the degeneracy in $l$ is
lifted by electron-electron interaction even in the absence of the external magnetic field.

The strict quantum-mechanical analysis of the anomalous Zeeman effect for arbitrary $s$ (which is
important for applications to multi-electron atoms) is not that complex, but requires explicit expressions
for the corresponding Clebsch-Gordan coefficients, which are rather bulky. Let me just cite the
unexpectedly simple result of this analysis:

$$\Delta E = \mu_B\mathcal{B}\,m_j\,g, \tag{6.69}$$

where $g$ is the so-called Landé factor: 24

$$g = 1 + \frac{j(j + 1) + s(s + 1) - l(l + 1)}{2j(j + 1)}. \tag{6.70}$$

For $s = 1/2$ (and hence $j = l \pm 1/2$), this factor is reduced to the parentheses in the last form of Eq. (68).
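The reduction of the Landé factor (70) to the $s = 1/2$ result is a one-line calculation, but it is also easy to verify mechanically. A minimal sketch using exact fractions (the function name `lande_g` is illustrative):

```python
from fractions import Fraction

def lande_g(j, l, s):
    """Lande factor of Eq. (6.70); arguments may be ints or Fractions (j must be nonzero)."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))

half = Fraction(1, 2)
g_p32 = lande_g(Fraction(3, 2), 1, half)   # l = 1, j = l + 1/2
g_p12 = lande_g(half, 1, half)             # l = 1, j = l - 1/2
g_s12 = lande_g(half, 0, half)             # pure spin: l = 0, j = s = 1/2
g_normal = lande_g(1, 1, 0)                # no spin: the "normal" Zeeman effect
print(g_p32, g_p12, g_s12, g_normal)
```

For $l = 1$ the two values are $1 \pm 1/3$, i.e. $1 \pm 1/(2l + 1)$; the pure-spin case gives $g = 2$, and the spinless case gives $g = 1$.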


It is remarkable that Eqs. (69)-(70) may be readily derived using very plausible classical
arguments, similar to those used in Sec. 5.7 - see Fig. 5.11 and its discussion. As we have seen above, in
the absence of spin, the quantization of observable $L_z$ is an extension of the classical torque-induced
precession of the corresponding vector (say, $\mathbf{L}$) about the magnetic field's direction, so that the
interaction energy, proportional to $\mathcal{B}L_z = \boldsymbol{\mathcal{B}}\cdot\mathbf{L}$, remains constant (Fig. 9a). At the spin-orbit
interaction without an external magnetic field, the Hamiltonian includes the operator of the product
$\mathbf{S}\cdot\mathbf{L}$, so that it has to be quantized, i.e. constant, together with $J^2$, $L^2$, and $S^2$. Hence, this
system's classical image is a rapid precession of vectors $\mathbf{S}$ and $\mathbf{L}$ about the direction of vector
$\mathbf{J} = \mathbf{L} + \mathbf{S}$, so that the spin-orbit interaction energy, proportional to the product $\mathbf{L}\cdot\mathbf{S}$, remains
constant (Fig. 9b). On this backdrop, the anomalous Zeeman effect in a relatively weak magnetic field
$\boldsymbol{\mathcal{B}} = \mathcal{B}\mathbf{n}_z$ corresponds to a slow precession of vector $\mathbf{J}$ ("dragging" the rapidly rotating vectors
$\mathbf{L}$ and $\mathbf{S}$ with it) about axis $z$.




Fig. 6.9. Classical images of (a) the 
orbital angular momentum’s quantization 
in external magnetic field and (b) the 
fine-structure level splitting. 


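This picture allows us to conjecture that what is important for the slow precession rate are only
the vectors $\mathbf{L}$ and $\mathbf{S}$ averaged over the period of the much faster precession about vector $\mathbf{J}$ - in
other words, only their components $\mathbf{L}_J$ and $\mathbf{S}_J$ directed along vector $\mathbf{J}$. Classically, these
components may be calculated as

$$\mathbf{L}_J = \mathbf{J}\,\frac{\mathbf{L}\cdot\mathbf{J}}{J^2}, \qquad \mathbf{S}_J = \mathbf{J}\,\frac{\mathbf{S}\cdot\mathbf{J}}{J^2}. \tag{6.71}$$

The scalar products participating in these expressions may be readily expressed via the squared lengths
of the vectors, using the following evident formulas:

$$S^2 = (\mathbf{J} - \mathbf{L})^2 = J^2 + L^2 - 2\mathbf{L}\cdot\mathbf{J}, \qquad L^2 = (\mathbf{J} - \mathbf{S})^2 = J^2 + S^2 - 2\mathbf{J}\cdot\mathbf{S}. \tag{6.72}$$

As a result, we get the following time average:

$$\overline{L_z + 2S_z} = \left(\mathbf{L}_J + 2\mathbf{S}_J\right)_z = \left(\mathbf{J}\,\frac{\mathbf{L}\cdot\mathbf{J}}{J^2} + 2\mathbf{J}\,\frac{\mathbf{S}\cdot\mathbf{J}}{J^2}\right)_z = \frac{J_z}{J^2}\left(\mathbf{L}\cdot\mathbf{J} + 2\mathbf{S}\cdot\mathbf{J}\right) = J_z\,\frac{(J^2 + L^2 - S^2) + 2(J^2 + S^2 - L^2)}{2J^2} = J_z\left(1 + \frac{J^2 + S^2 - L^2}{2J^2}\right). \tag{6.73}$$

24 This formula is frequently used with capital letters J, S, and L, which denote the quantum numbers of the atom
as a whole.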


The last move is to smuggle in some quantum mechanics by using, instead of the squared vector
lengths and the component $J_z$, their eigenvalues given by Eqs. (5.197), (5.203), and (5.204). As a
result, we immediately arrive at the exact result given by Eqs. (69)-(70). This coincidence encourages
thinking about quantum mechanics of angular momenta in classical terms of torque-induced precession,
an approach that turns out to be very fruitful in more complex problems of atomic and molecular physics.

The high-field and low-field limits of the Zeeman effect, described respectively by Eqs. (66) and
(68), are separated by a medium field strength range in which the Zeeman splitting is of the order of
the fine-structure splitting analyzed in Sec. 3. There is no time in this course for a quantitative
analysis of this crossover. 25


6.5. Time-dependent perturbations 

Now let us proceed to the case when perturbation $\hat H^{(1)}$ in Eq. (1a) is a function of time, while
$\hat H^{(0)}$ is time-independent. The adequate perturbative approach to this problem, and its results, depend
critically on the relation between the characteristic frequency (or the characteristic reciprocal time)
$\omega$ of the perturbation and the distance between the initial system's energy levels:

$$\hbar\omega \leftrightarrow |E_n - E_{n'}|. \tag{6.74}$$

In the easiest case when all essential frequencies of a perturbation are very small in the sense of
Eq. (74), we are dealing with the so-called adiabatic change of parameters, which may be treated
essentially as a time-independent perturbation (see the previous sections of this chapter). The most
interesting observation here is that the adiabatic perturbation does not allow any significant transfer of
the system's probability from one eigenstate to another. For example, in the WKB limit of the orbital
motion, the Bohr-Sommerfeld quantization rule (2.110), and its multi-dimensional generalization,
guarantee that the integral

$$\oint_C \mathbf{p}\cdot d\mathbf{r}, \tag{6.75}$$

taken along the particle's classical trajectory, is an adiabatic invariant, i.e. does not change at a slow
change of the system's parameters. (It is curious that classical mechanics also guarantees the invariance
of integral (75), but its proof there 26 is much harder than the quantum-mechanical derivation of this
fact, carried out in Sec. 2.4.) This is why even if the perturbation becomes large with time (while
changing sufficiently slowly), we can expect the eigenstate and eigenvalue classification to persist.

25 For a more complete discussion of the Stark, Zeeman, and fine-structure effects in atoms, I can recommend, for
example, either the monograph by G. Woolgate cited above, or the one by I. Sobelman, Theory of Atomic Spectra,
Alpha Science, 2006.

26 See, e.g., CM Sec. 10.2.

Now let us proceed to the more important (and more complex) case when both sides of Eq. (74)
are comparable, and use for its discussion the Schrödinger picture of quantum mechanics given by Eqs.
(4.157) and (4.158). Combining these equations, we get the Schrödinger equation in the form

$$i\hbar\frac{\partial}{\partial t}|\alpha(t)\rangle = \left(\hat H^{(0)} + \hat H^{(1)}(t)\right)|\alpha(t)\rangle. \tag{6.76}$$

Very much in the spirit of our treatment of the time-independent case in Sec. 1, let us represent the
time-dependent ket-vector of the system with its expansion,

$$|\alpha(t)\rangle = \sum_n |n\rangle\langle n|\alpha(t)\rangle, \tag{6.77}$$

over the full and orthonormal set of the unperturbed, stationary ket-vectors defined by equation

$$\hat H^{(0)}|n\rangle = E_n|n\rangle, \tag{6.78}$$

where bra-kets $\langle n|\alpha(t)\rangle$ are time-dependent coefficients. Plugging expansion (77), with $n$ replaced with
$n'$, into both sides of Eq. (76), and then inner-multiplying both its parts by bra-vector $\langle n|$ of another
unperturbed (and hence time-independent) state of the system, we get a set of linear, ordinary
differential equations for the expansion coefficients:

$$i\hbar\frac{d}{dt}\langle n|\alpha(t)\rangle = E_n\langle n|\alpha(t)\rangle + \sum_{n'} H^{(1)}_{nn'}(t)\,\langle n'|\alpha(t)\rangle, \tag{6.79}$$

where the matrix elements of the perturbation in the unperturbed state basis, defined similarly to Eq. (7),
are now functions of time:

$$H^{(1)}_{nn'}(t) \equiv \langle n|\hat H^{(1)}(t)|n'\rangle. \tag{6.80}$$

The set of differential equations (79), which are still exact, may be useful for numerical
calculations, because for virtually all practical problems the set of eigenstates $n'$ may be restricted with
an acceptable error in the final result. 27 However, Eq. (79) has a certain technical inconvenience, which
becomes clear if we consider its (evident) solution in the absence of perturbation: 28

$$\langle n|\alpha(t)\rangle = \langle n|\alpha(0)\rangle\exp\left\{-i\frac{E_n}{\hbar}t\right\}. \tag{6.81}$$

We see that the solution oscillates very fast, and its numerical modeling may present a challenge for
even the fastest computers. These spurious oscillations (whose frequency, in particular, depends on the
energy reference level) may be partly tamed by looking for the general solution of Eqs. (79) in a form
inspired by Eq. (81):

$$\langle n|\alpha(t)\rangle = a_n(t)\exp\left\{-i\frac{E_n}{\hbar}t\right\}. \tag{6.82}$$

Here $a_n(t)$ are new functions of time (essentially, the stationary states' probability amplitudes), which
may be used, in particular, to calculate the time-dependent level occupancies, i.e. the probabilities $W_n$ to
find the perturbed system on the corresponding energy levels of the unperturbed system:

$$W_n(t) = |\langle n|\alpha(t)\rangle|^2 = |a_n(t)|^2. \tag{6.83}$$

Plugging Eq. (82) into Eq. (79), for these functions we readily get a slightly modified system of
equations:

$$i\hbar\,\dot a_n = \sum_{n'} a_{n'}\,H^{(1)}_{nn'}(t)\,e^{i\omega_{nn'}t}, \tag{6.84}$$

where factors $\omega_{nn'}$, defined by relation

$$\hbar\omega_{nn'} \equiv E_n - E_{n'}, \tag{6.85}$$

have the physical sense of frequencies of potential quantum transitions between the $n$-th and $n'$-th
energy levels of the unperturbed system. (The conditions when such transitions indeed take place will be
discussed later in this chapter.) An advantage of Eq. (84) over Eq. (79) for numerical calculations is the
absence of any dependence on the energy reference selection, and lower frequencies of oscillations of
the right-hand part terms, especially when the energy levels of interest are close to each other.

27 Even if the problem under analysis may be described by the wave-mechanics Schrödinger equation (1.25), a
direct numerical integration of that partial differential equation is typically less convenient than that of the
ordinary differential equations (79).

28 This is of course just a more general form of Eq. (1.61) of wave mechanics of time-independent systems.
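To illustrate how Eqs. (84)-(85) may be used numerically, here is a minimal pure-Python sketch that integrates them with the classical 4th-order Runge-Kutta method. Everything specific in it is an assumption made for illustration only: the units with $\hbar = 1$, the two-level spectrum, and the real coupling $2\,\mathrm{amp}\cos wt$ between the levels.

```python
import cmath
import math

HBAR = 1.0  # assumption of this sketch: units with hbar = 1

def amplitude_rhs(t, a, energies, h1):
    """da_n/dt from Eq. (6.84): i*hbar*da_n/dt = sum_k a_k * H1_nk(t) * exp(i*w_nk*t)."""
    rhs = []
    for n, e_n in enumerate(energies):
        total = 0j
        for k, e_k in enumerate(energies):
            w_nk = (e_n - e_k) / HBAR                  # transition frequency, Eq. (6.85)
            total += a[k] * h1(t, n, k) * cmath.exp(1j * w_nk * t)
        rhs.append(total / (1j * HBAR))
    return rhs

def rk4_step(t, a, dt, energies, h1):
    """One classical Runge-Kutta step for the whole amplitude vector a."""
    def add(u, v, c):
        return [x + c * y for x, y in zip(u, v)]
    k1 = amplitude_rhs(t, a, energies, h1)
    k2 = amplitude_rhs(t + dt / 2, add(a, k1, dt / 2), energies, h1)
    k3 = amplitude_rhs(t + dt / 2, add(a, k2, dt / 2), energies, h1)
    k4 = amplitude_rhs(t + dt, add(a, k3, dt), energies, h1)
    return [x + dt / 6 * (p + 2 * q + 2 * r + s)
            for x, p, q, r, s in zip(a, k1, k2, k3, k4)]

def evolve(a0, energies, h1, t_end, steps):
    """Integrate the amplitudes from t = 0 to t_end in the given number of steps."""
    a, dt = list(a0), t_end / steps
    for i in range(steps):
        a = rk4_step(i * dt, a, dt, energies, h1)
    return a

# Hypothetical example: two levels, resonantly driven (w equals the level spacing).
amp, w = 0.05, 1.0

def h1(t, n, k):
    """Hermitian perturbation matrix element H1_nk(t) of the example."""
    return 0.0 if n == k else 2 * amp * math.cos(w * t)

a_final = evolve([1 + 0j, 0j], energies=[0.0, 1.0], h1=h1, t_end=20.0, steps=4000)
occupancies = [abs(x) ** 2 for x in a_final]   # W_n = |a_n|^2, Eq. (6.83)
print(occupancies, sum(occupancies))
```

Since the perturbation matrix is Hermitian, the sum of the occupancies should stay equal to 1 within the integration error, which gives a simple sanity check of the time step.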

In order to continue our analytical treatment, let us restrict ourselves to a particular but very
important case of a sinusoidal perturbation turned on at some moment - for example, at $t = 0$:

$$\hat H^{(1)}(t) = \begin{cases} 0, & \text{for } t < 0, \\ \hat A e^{-i\omega t} + \hat A^\dagger e^{+i\omega t}, & \text{for } t > 0, \end{cases} \tag{6.86}$$

where the perturbation amplitude operators $\hat A$ and $\hat A^\dagger$, and hence their matrix elements

$$\langle n|\hat A|n'\rangle \equiv A_{nn'}, \qquad \langle n|\hat A^\dagger|n'\rangle = A^*_{n'n}, \tag{6.87}$$

are time-independent. 29 In this case, for $t > 0$, Eq. (84) yields

$$i\hbar\,\dot a_n = \sum_{n'} a_{n'}\left[A_{nn'}e^{i(\omega_{nn'} - \omega)t} + A^*_{n'n}e^{i(\omega_{nn'} + \omega)t}\right]. \tag{6.88}$$


This is, generally, still a complex system of coupled differential equations; however, it allows
simple and explicit solutions in two very important cases. First, let us assume that our system is initially
in one eigenstate $n'$ (say, on the ground energy level), and that the occupancies $W_n$ of all other levels
stay very low all the time. (We will find the corresponding condition a posteriori - from the solution.)
With the corresponding assumption

$$a_{n'} = 1; \qquad |a_n| \ll 1, \ \text{for } n \ne n', \tag{6.89}$$

Eq. (88) may be readily integrated, giving

$$a_n = -\frac{A_{nn'}}{\hbar(\omega_{nn'} - \omega)}\left[e^{i(\omega_{nn'} - \omega)t} - 1\right] - \frac{A^*_{n'n}}{\hbar(\omega_{nn'} + \omega)}\left[e^{i(\omega_{nn'} + \omega)t} - 1\right], \quad \text{for } n \ne n'. \tag{6.90}$$

We see that the probability $W_n$ (83) of finding the system on each energy level oscillates in time, and
that our assumption (89) is satisfied as long as the excitation amplitude is not too large, 30

$$|A_{nn'}| \ll \hbar|\omega \pm \omega_{nn'}|. \tag{6.91}$$

Expression (90) also shows that this phenomenon has a clearly resonant character: the maximum
occupancy $W_n$ of a level grows infinitely when the corresponding detuning, 31

$$\Delta_{nn'} \equiv \omega - \omega_{nn'}, \tag{6.92}$$

tends to zero. In this limit, our initial assumption (89) may be violated; in order to overcome this
difficulty we may perform the following trick - very similar to the one we used for the transfer to the
degenerate case in Sec. 1. Let us assume that for a certain level $n$,

$$|\Delta_{nn'}| \ll \omega,\ |\omega \pm \omega_{n''n}|,\ |\omega \pm \omega_{n''n'}|, \quad \text{for all } n'' \ne n, n' \tag{6.93}$$

- the condition illustrated in Fig. 10. Then, according to Eq. (90), we may ignore the occupancy of all
but two levels, $n$ and $n'$, and also the second, non-resonant terms with frequency
$\omega_{nn'} + \omega \approx 2\omega \gg |\Delta_{nn'}|$ in Eqs. (88) written for $a_n$ and $a_{n'}$. 32

29 The notation of the amplitude operators in Eq. (86) is justified by the fact that the perturbation Hamiltonian has
to be self-adjoint (Hermitian), and hence each term in the right-hand part of that relation has to be a Hermitian
conjugate of its counterpart, which is evidently true only if the amplitude operators are also the Hermitian
conjugates of each other. Note, however, that each of the amplitude operators is generally not Hermitian.


Fig. 6.10. Resonant excitation of one of the higher energy levels.


As a result, in this two-level approximation (that is of course not an approximation at all for
two-level systems, such as spin-1/2 - see Sec. 5.1), we get a simple system of two linear equations:

$$i\hbar\,\dot a_n = a_{n'}Ae^{-i\Delta t}, \qquad i\hbar\,\dot a_{n'} = a_nA^*e^{+i\Delta t}, \tag{6.94}$$

where I have used the shorthand notation $A \equiv A_{nn'}$ and $\Delta \equiv \Delta_{nn'}$ - and will use it for a while, until
other energy levels become involved (in the beginning of the next section). This system of linear
differential equations may be solved exactly by the introduction of a new variable (for one of the levels
only!)

$$b_n \equiv a_ne^{+i\Delta t}. \tag{6.95}$$

According to this formula,

$$a_n = b_ne^{-i\Delta t}. \tag{6.96}$$

Plugging these relations into Eq. (94), we see that both equations of the system lose their explicit time
dependence:

$$i\hbar\left(\dot b_n - i\Delta b_n\right) = a_{n'}A, \qquad i\hbar\,\dot a_{n'} = b_nA^*, \tag{6.97}$$

and now may be readily solved by regular methods. For example, we may differentiate the first
equation, and then use the second one to eliminate variable $a_{n'}$:

$$i\hbar\left(\ddot b_n - i\Delta\dot b_n\right) = \dot a_{n'}A = \frac{b_nA^*}{i\hbar}A = b_n\frac{|A|^2}{i\hbar}. \tag{6.98}$$

From mathematics we know that the resulting linear, second-order differential equation, with
time-independent coefficients, has the following general solution,

$$b_n(t) = b_+e^{\lambda_+t} + b_-e^{\lambda_-t}, \tag{6.99}$$

whose characteristic exponents $\lambda_\pm$ may be readily found by plugging any of the exponential functions
into Eq. (98). In our case, both roots of the resulting characteristic equation,

$$\lambda^2 - i\Delta\lambda + \frac{|A|^2}{\hbar^2} = 0, \tag{6.100}$$

are purely imaginary: $\lambda_\pm = i(\Delta/2 \pm \Omega)$, where

$$\Omega \equiv \left(\frac{\Delta^2}{4} + \frac{|A|^2}{\hbar^2}\right)^{1/2} \tag{6.101}$$

is the (half-) frequency of the Rabi oscillations.

30 Strictly speaking, another condition is that the number of "resonant" levels is also not too high - see Sec. 6.

31 The notion of detuning is also very useful in the classical theory of oscillations - see, e.g., CM Chapter 4.

32 Such omission of non-resonant terms is usually called the Rotating Wave Approximation (RWA); it is very
instrumental not only in quantum mechanics, but also in the classical theory of oscillations - see, e.g., CM Secs.


The coefficients $b_\pm$ are determined by initial conditions. If, as before, the system was completely
on level $n'$ initially, i.e. $a_{n'}(0) = 1$, $a_n(0) = b_n(0) = 0$, then Eq. (99) immediately yields $b_- = -b_+$, so that

$$b_n(t) = 2ib_+e^{i\Delta t/2}\sin\Omega t, \qquad a_n(t) = 2ib_+e^{-i\Delta t/2}\sin\Omega t, \qquad \dot a_n(0) = 2ib_+\Omega. \tag{6.102}$$

Now the coefficient $b_+$ may be readily found from the comparison of the last equality in Eq. (102) with
the first of Eqs. (94), taken for $t = 0$, when $a_{n'} = 1$. This comparison yields $2ib_+\Omega = A/i\hbar$, and hence

$$a_n(t) = -\frac{iA}{\hbar\Omega}e^{-i\Delta t/2}\sin\Omega t, \tag{6.103}$$

so that the $n$-th level occupancy is

$$W_n = |a_n|^2 = \frac{|A|^2}{\hbar^2\Omega^2}\sin^2\Omega t = \frac{|A|^2}{|A|^2 + (\hbar\Delta/2)^2}\sin^2\Omega t. \tag{6.104}$$

This is the famous Rabi formula. 33 It shows that an increase of the perturbation amplitude $|A|$
leads not only to an increase of the amplitude of the probability oscillations, but also of their frequency
$2\Omega$ described by Eq. (101) - see Fig. 11.


Fig. 6.11. Rabi oscillations. (Horizontal axis: normalized time $t/(2\pi\hbar/|A|)$.)
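The Rabi formula may be cross-checked against a direct numerical integration of the two-level equations (94). The following sketch does this with a hand-written Runge-Kutta loop; the units with $\hbar = 1$ and the particular values of $A$ and $\Delta$ are arbitrary assumptions chosen only for illustration.

```python
import cmath
import math

HBAR = 1.0  # assumption of this sketch: units with hbar = 1

def rwa_rhs(t, a_n, a_np, A, delta):
    """Time derivatives of (a_n, a_n') from the two-level equations (6.94)."""
    da_n = a_np * A * cmath.exp(-1j * delta * t) / (1j * HBAR)
    da_np = a_n * A.conjugate() * cmath.exp(+1j * delta * t) / (1j * HBAR)
    return da_n, da_np

def integrate_rwa(A, delta, t_end, steps):
    """Runge-Kutta integration of Eqs. (6.94), starting with the system fully on level n'."""
    a_n, a_np = 0j, 1 + 0j
    dt = t_end / steps
    for i in range(steps):
        t = i * dt
        k1 = rwa_rhs(t, a_n, a_np, A, delta)
        k2 = rwa_rhs(t + dt / 2, a_n + dt / 2 * k1[0], a_np + dt / 2 * k1[1], A, delta)
        k3 = rwa_rhs(t + dt / 2, a_n + dt / 2 * k2[0], a_np + dt / 2 * k2[1], A, delta)
        k4 = rwa_rhs(t + dt, a_n + dt * k3[0], a_np + dt * k3[1], A, delta)
        a_n += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        a_np += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return a_n, a_np

def rabi_occupancy(A, delta, t):
    """Occupancy W_n(t) according to Eqs. (6.101) and (6.104)."""
    omega = math.sqrt(delta ** 2 / 4 + abs(A) ** 2 / HBAR ** 2)
    return abs(A) ** 2 / (HBAR * omega) ** 2 * math.sin(omega * t) ** 2

A, delta, t_end = 0.3 + 0.1j, 0.4, 10.0      # hypothetical parameters
a_n, a_np = integrate_rwa(A, delta, t_end, steps=2000)
w_numeric = abs(a_n) ** 2
w_formula = rabi_occupancy(A, delta, t_end)
print(w_numeric, w_formula)
```

The two printed numbers should agree within the integration error, and $|a_n|^2 + |a_{n'}|^2$ should stay equal to 1.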

Ultimately, at $|A| \gg \hbar|\Delta|$ (for example, at the exact resonance, $\Delta = 0$) Eqs. (101) and (104) give
$\Omega = |A|/\hbar$ and $(W_n)_{\max} = 1$, i.e. describe a periodic, full "repumping" of the system from one level to
another and back, with a frequency proportional to the perturbation amplitude. This effect gives a very
convenient tool for manipulating two-level systems (qubits, in the quantum information context). For
example, by limiting the external excitation to the time interval $\Delta t = \pi/2\Omega$ (or an odd number of such
intervals) we may completely transfer the system from one eigenstate (say, ↓) to the opposite one (↑). 34
On the Bloch sphere (Fig. 5.1), this transfer corresponds to the representing point's drive from the South
Pole to the North Pole.

Note, however, that according to Eq. (90), if the system has energy levels other than $n$ and $n'$,
they also become occupied to some extent. Since the sum of occupancies should be 1, this means that
$(W_n)_{\max}$ may approach 1 only if the excitation amplitude is very small, and hence the state switching
time $\Delta t = \pi/2\Omega = \pi\hbar/2|A|$ is very long. The ultimate limit in this sense is provided by the harmonic
oscillator, where all energy levels are equidistant, and probability repumping between all of them occurs
with the same rate. Hence, in that particular system, the implementation of the full Rabi oscillations is
impossible even at the exact resonance. 35 In the opposite limit, when the detuning is large in comparison
with $|A|/\hbar$, though still small in the sense of Eq. (93), the frequency of Rabi oscillations is completely
determined by the detuning, and their amplitude is small:

$$W_n(t) = 4\frac{|A|^2}{\hbar^2\Delta^2}\sin^2\frac{\Delta t}{2} \ll 1, \quad \text{for } |A|^2 \ll (\hbar\Delta)^2. \tag{6.105}$$

33 It was derived in 1937 by I. Rabi, in the context of his group's pioneering experiments with microwave
excitation of quantum states, using molecular beams in vacuum.

34 In the quantum information science language, this is just a logic operation NOT performed on a single qubit.

35 We, of course, already know what happens to the ground state of an oscillator at its external sinusoidal (or any
other) excitation: it turns into the Glauber state, i.e. a superposition of all Fock states - see Sec. 5.5.


However, I would not like these quantitative details to obscure from the reader the most
important qualitative (OK, maybe semi-quantitative :-) conclusion of this section's analysis: the
resonant increase of the interlevel transition intensity at $\omega \to \omega_{nn'}$. Using the fundamental
Kramers-Kronig dispersion relations, 36 based essentially only on very general causality arguments, it is
easy to show (and hence left for the reader's exercise) that in a medium incorporating many similar
quantum systems (e.g., atoms or molecules), this increase of quantum transitions is accompanied by a
sharp increase of the external field's absorption. This effect has numerous practical applications
including systems based on the electron paramagnetic resonance (EPR) and nuclear magnetic resonance
(NMR) spectroscopies, which are broadly used in material science, chemistry, and medicine.
Unfortunately, I will not have time to discuss the related technical issues (in particular, interesting
pulsing spectroscopy techniques) in detail, and have to refer the reader to special literature. 37


6.6. Quantum-mechanical Golden Rule 

The last result of the past section, Eq. (105), may be used to derive one of the most important
results of quantum mechanics - its so-called Golden Rule. For that, let us consider the case when the
perturbation causes quantum transitions from a discrete energy level $E_{n'}$ into a group of eigenstates
$E_n$ with a dense (virtually continuous) spectrum - see Fig. 12a. If, for all states $n$ of the group, the
following conditions are satisfied

$$|A_{nn'}|^2 \ll (\hbar\Delta_{nn'})^2 \ll (\hbar\omega_{nn'})^2, \tag{6.106}$$

then Eq. (105) coincides with the result that would follow from Eq. (90). This means that we may apply
Eq. (105), with indices $n$ and $n'$ duly restored, to any level $n$ of our tight group. As a result, the total
probability of having our system transferred from level $n'$ to that group is

$$W_\Sigma(t) = \sum_n W_n(t) = \frac{4}{\hbar^2}\sum_n\frac{|A_{nn'}|^2}{\Delta_{nn'}^2}\sin^2\frac{\Delta_{nn'}t}{2}. \tag{6.107}$$

Fig. 6.12. Deriving the Golden Rule: (a) the energy level scheme, and (b) the function under integral (108).


36 See, e.g., EM Sec. 7.3, in particular, the correspondence between Eqs. (7.55) and (7.56).

37 For introductions see, e.g., J. Wertz and J. Bolton, Electron Spin Resonance, 2nd ed., Wiley, 2007; J. Keeler,
Understanding NMR Spectroscopy, 2nd ed., Wiley, 2010.

Now comes the main, absolutely beautiful trick: let us assume that the summation over $n$ is
limited to a tight group of very similar states whose matrix elements $A_{nn'}$ are virtually the same (we
will check the validity of this assumption later on), so that we can take them out of the sum (107) and
then replace the sum with the corresponding integral:

$$W_\Sigma(t) = \frac{4|A_{nn'}|^2}{\hbar^2}\int\frac{1}{\Delta_{nn'}^2}\sin^2\frac{\Delta_{nn'}t}{2}\,dn = \frac{4|A_{nn'}|^2\rho_n}{\hbar}\int\frac{1}{\Delta_{nn'}^2}\sin^2\frac{\Delta_{nn'}t}{2}\,d\Delta_{nn'}, \tag{6.108}$$


where $\rho_n$ is the density of eigenstates $n$ on the energy axis:

$$\rho_n \equiv \frac{dn}{dE_n}. \tag{6.109}$$

This density, as well as the matrix element $A_{nn'}$, have to be evaluated at $\Delta_{nn'} = 0$, i.e. at energy
$E_n = E_{n'} + \hbar\omega$, and are assumed to be constant within the final state group. At fixed $E_{n'}$, the function
under integral (108) is even and decreases fast at $|\Delta_{nn'}t| \gg 1$ - see Fig. 12b. Hence we may introduce a
dimensionless integration variable $\xi \equiv \Delta_{nn'}t$, and extend integration over this variable formally from
$-\infty$ to $+\infty$. Then Eq. (108) is reduced to a table integral, 38 and yields

$$W_\Sigma(t) = \frac{4|A_{nn'}|^2\rho_n}{\hbar}\,t\int_{-\infty}^{+\infty}\frac{1}{\xi^2}\sin^2\frac{\xi}{2}\,d\xi = \Gamma t, \tag{6.110}$$

where the constant

$$\Gamma = \frac{2\pi}{\hbar}|A_{nn'}|^2\rho_n \tag{6.111}$$

is called the transition rate. 39
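The transfer from the sum of oscillating terms (105) to the linear growth $\Gamma t$ may be checked numerically. The sketch below is only an illustration under explicit assumptions: units with $\hbar = 1$, equidistant final states (so that $\rho_n$ is the reciprocal level spacing), the same real matrix element $A$ for all of them, and arbitrary hypothetical parameter values.

```python
import math

HBAR = 1.0  # assumption of this sketch: units with hbar = 1

def total_transfer_probability(A, spacing, n_side, t):
    """Sum of Eq. (6.105) over final states with detunings k*spacing, k = -n_side..+n_side."""
    total = 0.0
    for k in range(-n_side, n_side + 1):
        delta = k * spacing
        if k == 0:
            total += (A * t / HBAR) ** 2     # limit of Eq. (6.105) at zero detuning
        else:
            total += 4 * A ** 2 / (HBAR * delta) ** 2 * math.sin(delta * t / 2) ** 2
    return total

A, spacing, t = 1e-3, 0.01, 50.0             # hypothetical values
rho = 1.0 / (HBAR * spacing)                 # states per unit energy for equidistant levels
gamma = 2 * math.pi * A ** 2 * rho / HBAR    # transition rate, Eq. (6.111)
w_sum = total_transfer_probability(A, spacing, n_side=20000, t=t)
print(w_sum, gamma * t)
```

With these numbers the level spacing is much smaller than $\hbar/t$, while $\Gamma t \ll 1$, so that both the replacement of the sum with an integral and the use of Eq. (105) are legitimate, and the two printed values should nearly coincide.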


This is one of the most famous and useful results of quantum mechanics, its Golden Rule
(sometimes, rather unfairly, called the "Fermi Golden Rule" 40), which deserves much discussion. First
of all, let us reproduce the reasoning already used in Sec. 2.5 to show that the meaning of rate $\Gamma$ is much
deeper than Eq. (110) seems to imply. Indeed, due to the conservation of the total probability,
$W_{n'} + W_\Sigma = 1$, we can rewrite that equation as

$$\dot W_{n'} = -\Gamma. \tag{6.112}$$

Evidently, this result cannot be true for $t \to \infty$, otherwise probability $W_{n'}$ would become negative. The
reason for that apparent contradiction is that result (110) was obtained in the assumption that initially
the system was completely on level $n'$: $W_{n'}(0) = 1$. Now, if in the initial moment the value of $W_{n'}$ is
different, result (110) has to be multiplied by that number, due to the linear relation (88) between
$da_n/dt$ and $a_{n'}$. Hence, instead of Eq. (112) we get a differential equation similar to Eq. (2.159),

$$\dot W_{n'} = -\Gamma W_{n'}, \tag{6.113}$$

which, for time-independent $\Gamma$, has the evident solution,

$$W_{n'}(t) = W_{n'}(0)e^{-\Gamma t}, \tag{6.114}$$

describing an exponential decay of the initial state's occupancy, with time constant $\tau = 1/\Gamma$.

38 See, e.g., MA Eq. (6.12).

39 In some texts, the density of states in Eq. (111) is replaced with expression $\sum_n\delta(E_n - E_{n'} - \hbar\omega)$. Indeed, the
integration of this expression over any finite energy interval $\Delta E_n$ gives the same result
$\Delta n = (dn/dE_n)\Delta E_n \equiv \rho_n\Delta E_n$ as Eq. (111). Such replacement may be useful in some cases, but should be used with
utmost care, and for most applications the more explicit form (111) is preferable.

40 Actually, this result was developed mostly by the same P. A. M. Dirac in 1927; E. Fermi's role was not much
more than advertising it, under the name of "Golden Rule No. 2", in his lecture notes on nuclear physics, which
were published much later, in 1950. (To be fair to Fermi, he never tried to pose as the Golden Rule's author.)


I would ask the reader to think again about this fascinating mathematical result: by summation of
periodic oscillations (105) over many levels $n$, we have got an exponential evolution (114) of the
probability. The main trick here is of course that the effective range $\Delta E_n$ of states $E_n$, giving the
dominating contribution into integral (108), shrinks with time: $\Delta E_n \sim \hbar/t$. 41 By the way, since most of
the decay takes place at times $t \sim \tau = 1/\Gamma$, the range of participating final energies may be estimated as

$$\Delta E_n \sim \hbar\Gamma. \tag{6.115}$$

This estimate is very instrumental for the formulation of conditions of validity of the Golden Rule (111).
First, we have assumed that the matrix elements of the perturbation and the density of states do not
depend on energy within interval (115). This gives the following requirement:

$$\Delta E_n \sim \hbar\Gamma \ll E_n - E_{n'} \sim \hbar\omega. \tag{6.116}$$

Second, for the transfer from sum (107) to integral (108), we need the number of states within that
energy interval, $\Delta N_n = \rho_n\Delta E_n$, to be much larger than 1. Merging Eq. (116) with Eq. (93) for all energy
levels $n'' \ne n, n'$ not participating in the resonant transfer, we may summarize all conditions of the
Golden Rule validity as

$$\rho_n^{-1} \ll \hbar\Gamma \ll \hbar|\omega \pm \omega_{n''n'}|. \tag{6.117}$$


(The reader may ask whether I have forgotten the condition expressed by the first of Eqs. (106).
However, for $\Delta_{nn'} \sim \Delta E_n/\hbar \sim \Gamma$, this condition is just $|A_{nn'}|^2 \ll (\hbar\Gamma)^2$, so that plugging it into
Eq. (111),

$$\Gamma \ll \frac{2\pi}{\hbar}(\hbar\Gamma)^2\rho_n, \tag{6.118}$$

and canceling one $\Gamma$ and one $\hbar$, we see that this requirement coincides with the left relation in Eq. (117)
above.)

Let us have a look at whether these conditions may be satisfied in practice, at least in some
cases. For example, let us consider the optical ionization of an atom, with the released electron confined
in a volume of the order of $1\ \text{cm}^3 = 10^{-6}\ \text{m}^3$. According to Eq. (1.82), with $E$ of the order of the atomic
ionization energy $E_n - E_{n'} = \hbar\omega \sim 1$ eV, the density of electron states in that volume is of the order of
$10^{17}$ 1/eV. Thus conditions (117) provide an approximately 15-orders-of-magnitude range for acceptable
values of $\hbar\Gamma$. This illustration should give the reader a taste of why the Golden Rule is applicable to so
many situations.

41 Here we have run again, in a more general context, into the "energy-time uncertainty relation" which was
already discussed in the end of Sec. 2.5. Let me advise the reader to revisit that important discussion.

Finally, the physical picture of the initial state's decay (which will also be the key for our
discussion of quantum mechanics of "open" systems in the next chapter) is also very important.
According to Eq. (114), the external excitation transfers the system onto the continuous spectrum of
levels $n$, and it never comes back on the initial level $n'$. However, this equation was derived from
quantum mechanics of Hamiltonian systems, whose equations are invariant with respect to time reversal.
This paradox is a result of the generalization (113) of the exact result (112), which breaks the time
reversal symmetry, but is absolutely adequate for the physics under study. Some gut feeling of the
physical sense of this irreversibility may be obtained from the following observation. From our
wave-mechanics experience, we know that the distance between adjacent orbital energy levels tends to
zero only if the system size goes to infinity. This means that the assumption of a continuous energy
spectrum of final states $n$ essentially requires these states to be infinitely extended in space -
essentially being free de Broglie waves. The Golden Rule approach corresponds to the (physically
justified) assumption that in an infinitely large system the traveling waves excited by a local source and
propagating outward from it would never come back, and even if they did, the unpredictable phase
shifts introduced by the uncontrollable perturbations on their way would never allow them to sum up in
the way necessary to bring the system back into the initial state $n'$. 42

Maybe the best illustration of this interpretation is given by the following problem - which is a
toy model of the photoelectric effect that was briefly discussed in Sec. 1.1(iii). A 1D particle is initially
trapped in the ground state of a narrow quantum well,

$$U(x) = -\mathcal{W}\delta(x). \tag{6.119}$$

Let us use the Golden Rule to find rate $\Gamma$ of the particle's "ionization" (i.e. its excitation into an
extended, delocalized state) by a weak classical sinusoidal force of amplitude $F_0$ and frequency $\omega$. As a
reminder, finding the initial, localized state ($n'$) of such a particle was the task of Problem 2.14, and its
solution was

$$\psi_{n'}(x) = \kappa^{1/2}\exp\{-\kappa|x|\}, \qquad \kappa \equiv \frac{m\mathcal{W}}{\hbar^2}, \qquad E_{n'} = -\frac{\hbar^2\kappa^2}{2m} = -\frac{m\mathcal{W}^2}{2\hbar^2}. \tag{6.120}$$

(6.120) 


Extended states $n$ with continuous spectrum exist, for this problem, only at energies $E_n > 0$, so that
the excitation rate is different from zero only for frequencies

$$\omega > \frac{|E_{n'}|}{\hbar} = \frac{m\mathcal{W}^2}{2\hbar^3}. \tag{6.121}$$

The weak sinusoidal force may be described by the following perturbation Hamiltonian,

$$\hat H^{(1)} = -F(t)\hat x = -F_0\hat x\cos\omega t = -\frac{F_0}{2}\hat x\left(e^{i\omega t} + e^{-i\omega t}\right), \quad \text{for } t > 0, \tag{6.122}$$

so that according to Eq. (86), which serves as the amplitude operator definition, in this case

$$\hat A = \hat A^\dagger = -\frac{F_0}{2}\hat x. \tag{6.123}$$

Now the matrix elements $A_{nn'}$ that participate in Eq. (111) may be calculated in the coordinate
representation:

$$A_{nn'} = \int_{-\infty}^{+\infty}\psi_n^*(x)A(x)\psi_{n'}(x)\,dx = -\frac{F_0}{2}\int_{-\infty}^{+\infty}\psi_n^*(x)\,x\,\psi_{n'}(x)\,dx. \tag{6.124}$$

42 This situation is very much similar to the entropy increase in macroscopic systems, which is postulated in
thermodynamics, and justified in statistical physics, even though it is based on time-reversible laws of mechanics
- see, e.g., SM Sec. 1.2 and Sec. 2.2.


Since, according to Eq. (120), the initial y/ n ’ is a symmetric function of x, a nonvanishing 
contribution to this integral is given only by asymmetric functions ys„(x), proportional to sin k„x, with 
wavenumber k n related to the final energy by the well-familiar equality (1.77): 

h 2 k 2 

~^ = E n . (6.125) 

2m 

As we know from Sec. 2.5 (see in particular Eq. (2.124) and its discussion), such antisymmetric functions, with $\psi_n(0) = 0$, are not affected by the zero-centered delta-functional potential (119), so their density $\rho_n$ is the same as in completely free space, and we can use Eq. (1.94). (Actually, since that relation was derived for traveling waves, it is more prudent to repeat the calculation that has led to that result, confining the waves to an artificial segment $[-l/2, +l/2]$, so long,

$$k_n l, \ \kappa l \gg 1, \tag{6.126}$$

that it does not affect the initial localized state and the excitation process. Then the confinement requirement $\psi_n(\pm l/2) = 0$ immediately yields the condition $k_n l/2 = n\pi$, so that Eq. (1.94) is indeed valid, but only for positive values of $k_n$, because $\sin k_n x$ with $k_n \to -k_n$ does not give an independent standing-wave eigenstate.) Hence the final state density is


$$\rho_n \equiv \frac{dn}{dE_n} = \frac{dn}{dk_n}\bigg/\frac{dE_n}{dk_n} = \frac{l}{2\pi}\bigg/\frac{\hbar^2k_n}{m} = \frac{lm}{2\pi\hbar^2k_n}. \tag{6.127}$$


It may look troubling that the density of states depends on the artificial segment's length $l$, but the same $l$ also participates in the normalization factor of the final wavefunction, 43

$$\psi_n = \left(\frac{2}{l}\right)^{1/2}\sin k_n x, \tag{6.128}$$


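and hence the matrix element (124):

$$A_{nn'} = -\frac{F_0}{2}\left(\frac{2\kappa}{l}\right)^{1/2}\int_{-l/2}^{+l/2}\sin k_n x\, e^{-\kappa|x|}\,x\,dx = -\frac{F_0}{2i}\left(\frac{2\kappa}{l}\right)^{1/2}\left(\int_0^{l/2}e^{(ik_n-\kappa)x}\,x\,dx - \int_0^{l/2}e^{-(ik_n+\kappa)x}\,x\,dx\right). \tag{6.129}$$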


These two integrals may be readily worked out by parts. Taking into account that, according to condition (126), their upper limits may be extended to $\infty$, the result is

$$A_{nn'} = -\frac{F_0}{2i}\left(\frac{2\kappa}{l}\right)^{1/2}\frac{4ik_n\kappa}{(k_n^2+\kappa^2)^2}, \tag{6.130}$$

so that finally we get an expression for the rate, which is independent of the artificially introduced $l$:

$$\Gamma = \frac{2\pi}{\hbar}|A_{nn'}|^2\rho_n = \frac{2\pi}{\hbar}F_0^2\,\frac{2\kappa}{l}\left[\frac{2k_n\kappa}{(k_n^2+\kappa^2)^2}\right]^2\frac{lm}{2\pi\hbar^2k_n} = \frac{8F_0^2mk_n\kappa^3}{\hbar^3(k_n^2+\kappa^2)^4}. \tag{6.131}$$

43 The normalization to infinite volume, using Eq. (5.55), is also possible, but less convenient in such problems.


Note that due to the above definitions of $k_n$ and $\kappa$, the expression in parentheses in the denominator of the last formula does not depend on the quantum well parameter $\mathcal{W}$, and is a function of only the excitation frequency $\omega$ (and the particle's mass):

$$\frac{\hbar^2(k_n^2+\kappa^2)}{2m} = E_n - E_{n'} = \hbar\omega. \tag{6.132}$$


As a result, Eq. (131) may be recast simply as

$$\hbar\Gamma = \frac{F_0^2\mathcal{W}^3k_n}{2(\hbar\omega)^4}. \tag{6.133}$$


What is still hidden here is that $k_n$, defined by Eq. (125) with $E_n = E_{n'} + \hbar\omega$, is a function of frequency, changing as $\omega^{1/2}$ at $\omega \gg \omega_t$ (so that $\Gamma$ drops as $\omega^{-7/2}$ at $\omega \to \infty$), and as $(\omega - \omega_t)^{1/2}$ when $\omega$ approaches the “red boundary” $\omega_t \equiv |E_{n'}|/\hbar = m\mathcal{W}^2/2\hbar^3$ of the ionization effect, so that $\Gamma \propto (\omega - \omega_t)^{1/2} \to 0$ in that limit as well. We see that our toy model does describe this main feature of the photoelectric effect, whose explanation by Einstein was essentially the starting point of quantum mechanics - see Sec. 1.1. The (very similar) analysis of this effect in a more realistic model, the hydrogen atom's ionization, is left for the reader's exercise.
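As a quick numerical sanity check of this toy-model result, here is a minimal Python sketch (not part of the original derivation). It evaluates the rate given by Eq. (133) and verifies the closed form of the integral in Eq. (129) with its upper limit extended to infinity. The function names `ionization_rate` and `overlap_integral`, the dimensionless units $\hbar = m = \mathcal{W} = F_0 = 1$, and the integration grid are my assumptions, chosen only for illustration.

```python
import numpy as np

def ionization_rate(omega, F0=1.0, W=1.0, m=1.0, hbar=1.0):
    """Return hbar*Gamma from Eq. (6.133); zero below the threshold."""
    omega = np.asarray(omega, dtype=float)
    omega_t = m * W**2 / (2 * hbar**3)            # "red boundary", Eq. (6.121)
    # Eq. (6.125) with E_n = E_n' + hbar*omega = hbar*(omega - omega_t)
    k_squared = 2 * m * (omega - omega_t) / hbar
    k_n = np.sqrt(np.clip(k_squared, 0.0, None))  # no final states below threshold
    return F0**2 * W**3 * k_n / (2 * (hbar * omega)**4)

def overlap_integral(k, kappa, x_max=60.0, points=600_001):
    """Trapezoid-rule value of the integral of sin(kx)*x*exp(-kappa*x) over [0, x_max]."""
    x = np.linspace(0.0, x_max, points)
    f = np.sin(k * x) * x * np.exp(-kappa * x)
    dx = x[1] - x[0]
    return dx * (f.sum() - 0.5 * (f[0] + f[-1]))

k, kappa = 1.3, 1.0                                   # arbitrary test values
numeric = overlap_integral(k, kappa)
closed_form = 2 * k * kappa / (k**2 + kappa**2)**2    # result of integration by parts
print(numeric, closed_form)
print(ionization_rate([0.4, 0.6, 1.0, 10.0]))         # first entry is below threshold
```

In these units the threshold is $\omega_t = 0.5$, so the rate vanishes for the first sample frequency and falls off steeply for the last one.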


6.7. Golden Rule for step-like perturbations


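Now let us reuse some of our results for a perturbation that is turned on at $t = 0$, but is time-independent after that:

$$\hat H^{(1)}(t) = \begin{cases} 0, & \text{for } t < 0, \\ \hat H^{(1)} = \text{const}, & \text{for } t \ge 0. \end{cases} \tag{6.134}$$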

A superficial comparison of this equation with our former Eq. (69) seems to indicate that we may use all our previous results, taking $\omega = 0$. However, that conclusion does not take into account the fact that, in analyzing both the two-level approximation and the Golden Rule for a continuous spectrum, we have neglected the second (non-resonant) term in Eq. (90). This is why it is more prudent to use the general Eq. (86),

$$i\hbar\dot a_n = \sum_{n'} a_{n'} H^{(1)}_{nn'} e^{i\omega_{nn'}t}, \tag{6.135}$$


in which the matrix element of the perturbation is now time-independent. We see that it is formally equivalent to Eq. (88) with only the first (resonant) term kept, if we make the following replacements:

$$\hat A \to \hat H^{(1)}, \qquad \Delta_{nn'} \equiv \omega - \omega_{nn'} \to -\omega_{nn'}. \tag{6.136}$$


As a sanity check, let us revisit a two-level system such as two quantum wells coupled by tunneling - see Fig. 13a. It is convenient to include the energy difference $E_n - E_{n'}$ between the two levels into the unperturbed Hamiltonian, so that perturbation (134) describes only the coupling of the localized states due to tunneling through the energy barrier separating the wells. (The turning on of the coupling, described by Eq. (134), may be achieved, for example, by a rapid lowering of the barrier at $t = 0$.) Then, after replacements (136), we get an analog of Eq. (104):

$$W_n \equiv |a_n|^2 = \frac{\left|H^{(1)}_{nn'}\right|^2}{\hbar^2\Omega^2}\sin^2\Omega t, \tag{6.137}$$


where the frequency $\Omega$ of the periodic “probability repumping” between levels $n'$ and $n$ is now described, instead of Eq. (104), by the relation

$$2\Omega = \left(\omega_{nn'}^2 + 4\frac{\left|H^{(1)}_{nn'}\right|^2}{\hbar^2}\right)^{1/2} = \frac{1}{\hbar}\left[(E_n - E_{n'})^2 + 4\left|H^{(1)}_{nn'}\right|^2\right]^{1/2}. \tag{6.138}$$


But these are exactly the quantum oscillations that have already been discussed in Sec. 2.6 - now derived for arbitrary shapes of the quantum wells and the tunnel barrier.
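The two-level result of Eqs. (137)-(138) is easy to check numerically. The following minimal Python sketch (an illustration, not the author's calculation) diagonalizes a 2x2 Hamiltonian, propagates the initial state $n'$ exactly, and compares the occupation of level $n$ with the analytical formula. The numerical values of the level energies and of the coupling, as well as the names `prob_exact` and `prob_formula`, are hypothetical choices made only for this example.

```python
import numpy as np

hbar = 1.0
E_initial, E_final = 0.0, 0.7   # unperturbed energies E_n', E_n (hypothetical values)
V = 0.25                        # coupling matrix element H_nn' (hypothetical, real)

# Full Hamiltonian for t >= 0 in the basis (n', n)
H = np.array([[E_initial, V], [V, E_final]], dtype=complex)
evals, evecs = np.linalg.eigh(H)

def prob_exact(t):
    """Occupation of level n at time t, starting from level n' at t = 0."""
    a0 = np.array([1.0, 0.0], dtype=complex)
    a_t = evecs @ (np.exp(-1j * evals * t / hbar) * (evecs.conj().T @ a0))
    return abs(a_t[1])**2

# Eq. (6.138): 2*Omega = [(E_n - E_n')^2 + 4|H_nn'|^2]^(1/2) / hbar
Omega = np.sqrt((E_final - E_initial)**2 + 4 * V**2) / (2 * hbar)

def prob_formula(t):
    """Eq. (6.137): W_n = |H_nn'|^2 / (hbar*Omega)^2 * sin^2(Omega*t)."""
    return (V / (hbar * Omega))**2 * np.sin(Omega * t)**2

times = np.linspace(0.0, 20.0, 201)
max_deviation = max(abs(prob_exact(t) - prob_formula(t)) for t in times)
splitting = evals[1] - evals[0]   # E_+ - E_-, to be compared with 2*hbar*Omega
print(max_deviation, splitting, 2 * hbar * Omega)
```

The eigenvalue splitting printed at the end is the same quantity that appears in the comparison with Eq. (29) for a two-level system.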



Fig. 6.13. Quantum-well implementation of coupling of a discrete-energy state n ’ to (a) another 
discrete-energy state, and (b) a state continuum, due to tunneling through a potential barrier. 


The similarity of Eqs. (104) and (137) shows that the Rabi oscillations and the “usual” quantum oscillations have essentially the same physical nature, except that in the former case the external rf signal quantum $\hbar\omega$ bridges the state energy difference. We may also compare result (138) with our analysis of a two-level system, with a similar time-independent perturbation, in Sec. 1. According to Eq. (29), its eigenenergies differ by

$$E_+ - E_- = \left[(H_{11} - H_{22})^2 + 4H_{12}H_{21}\right]^{1/2}. \tag{6.139}$$

But this is exactly the result given by Eq. (138), provided that we consider $(H_{11} - H_{22})$ as the difference $(E_n - E_{n'})$ of unperturbed state energies rather than as a perturbation, as we certainly have a right to do.

Now let us consider the effect of perturbation (134) in the case when it creates coupling between the initial (discrete) energy level and a dense group of states with a quasi-continuous spectrum, in the same energy range. Figure 13b shows an example of such a system: a quantum well separated by a penetrable tunnel barrier from an extended region with a quasi-continuous energy spectrum. Making replacements (136) in Eq. (111), we may present the Golden Rule for this case as

$$\Gamma = \frac{2\pi}{\hbar}\left|H^{(1)}_{nn'}\right|^2\rho_n, \tag{6.140}$$

where states $n$ and $n'$ now have the same energy. 44
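To see this formula at work, here is a minimal numerical sketch in Python. It is only a toy illustration under my own assumptions: the quasi-continuum is modeled by $N$ equidistant levels in a flat band of half-width `half_band`, all coupled to the discrete level with the same real matrix element `v`, and the names `survival`, `half_band`, and the parameter values are hypothetical. For times well below the recurrence time $2\pi\hbar\rho_n$, the survival probability of the initial state should be close to $\exp\{-\Gamma t\}$ with $\Gamma$ given by the Golden Rule.

```python
import numpy as np

hbar = 1.0
N, half_band = 1001, 10.0
levels = np.linspace(-half_band, half_band, N)   # quasi-continuum energies
rho = 1.0 / (levels[1] - levels[0])              # density of states
Gamma = 0.2                                      # target Golden Rule rate
v = np.sqrt(hbar * Gamma / (2 * np.pi * rho))    # from Gamma = (2*pi/hbar) v^2 rho

# Hamiltonian: discrete level (index 0, energy 0) coupled to every band level
H = np.zeros((N + 1, N + 1))
H[1:, 1:] = np.diag(levels)
H[0, 1:] = v
H[1:, 0] = v

evals, evecs = np.linalg.eigh(H)
weights = evecs[0, :]**2     # overlaps of the initial state with the eigenstates

def survival(t):
    """Probability to find the system still in the initial discrete state."""
    amplitude = np.sum(weights * np.exp(-1j * evals * t / hbar))
    return abs(amplitude)**2

for t in (0.0, 2.5, 5.0, 10.0):
    print(t, survival(t), np.exp(-Gamma * t))
```

The small residual deviations from the exponential law are due to the finite bandwidth of the model; they shrink as `half_band` is increased at fixed `Gamma`.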

It is very informative to compare this result with Eq. (138) for a symmetric ($E_n = E_{n'}$) double quantum well using the same tunnel barrier - see Fig. 13. For the latter case, Eq. (138) yields

$$\Omega = \frac{1}{\hbar}\left|H^{(1)}_{nn'}\right|_{\rm con}. \tag{6.141}$$

Here I have used the index “con” (from “confinement”) to emphasize that this matrix element is rather different from the one participating in Eq. (140). Indeed, in the latter case, the matrix element,

$$H^{(1)}_{nn'} \equiv \langle n|\hat H^{(1)}|n'\rangle = \int\psi_n^*\hat H^{(1)}\psi_{n'}\,dx, \tag{6.142}$$


has to be calculated for two similar wavefunctions $\psi_n$ and $\psi_{n'}$ confined to spatial intervals of the same scale $l_{\rm con}$, while in Eq. (140), wavefunctions $\psi_n$ are extended over a much larger distance $l \gg l_{\rm con}$ - see Fig. 13. As Eq. (129) tells us, in the 1D model we are considering now, this means an additional small factor of the order of $(l_{\rm con}/l)^{1/2}$. Now using Eq. (128) as a crude but suitable model for the final-state wavefunctions, we arrive at the following estimate:

$$\hbar\Gamma = 2\pi\left|H^{(1)}_{nn'}\right|^2\rho_n \sim 2\pi\left|H^{(1)}_{nn'}\right|^2_{\rm con}\frac{l_{\rm con}}{l}\,\frac{lm}{2\pi\hbar^2k_n} \sim \frac{\left|H^{(1)}_{nn'}\right|^2_{\rm con}}{\Delta E_{n'}} \equiv \frac{(\hbar\Omega)^2}{\Delta E_{n'}}, \tag{6.143}$$


where $\Delta E_{n'} \sim \hbar^2/ml_{\rm con}^2$ is the scale of the differences between the eigenenergies of the particle in an unperturbed quantum well. Since the condition of validity of the perturbative formula (140) is $\hbar\Omega \ll \Delta E_{n'}$, we see that 45

$$\hbar\Gamma \sim \frac{\hbar\Omega}{\Delta E_{n'}}\,\hbar\Omega \ll \hbar\Omega. \tag{6.144}$$

Hence the rate of (irreversible) quantum tunneling into a continuum is always much lower than the frequency of (reversible) quantum oscillations between states separated by the same potential barrier - at least in the case when both are much lower than $\Delta E_{n'}/\hbar$, so that the perturbation theory is valid. A handwaving interpretation of this result is that the confined particle wanders beyond the barrier and back many times before finally “deciding” to perform an irreversible transition into the unconfined continuum. 46


44 The condition of its validity is again given by Eq. (117), but with $\omega \to 0$ in the upper limit.

45 It is straightforward to show that in this form, the estimate is valid for a similar problem of any spatial dimensionality, not just the 1D case we have analyzed.

46 This qualitative picture may be verified, for example, using the experimentally observable effects of dispersive electromagnetic environment on electron tunneling - see P. Delsing et al., Phys. Rev. Lett. 63, 1180 (1989).

Let me conclude this section (and this chapter) with the application of Eq. (140) to an important case, which will provide us with a smooth transition to the next chapter's topics. Consider a composite system consisting of two parts, a and b, with the energy spectra sketched in Fig. 14.

Fig. 6.14. Energy relaxation in system a due to its coupling with system b (which serves as the environment of a). [Diagram labels: system a, system b, energy quantum $\hbar\omega$, interaction $\hat H_{\rm int} = \hat A(a)\hat B(b)$.]


Let the systems be completely independent initially. The independence means that in the absence of perturbation, the total Hamiltonian of the system at $t < 0$ may be presented as a sum

$$\hat H^{(0)} = \hat H_a(a) + \hat H_b(b), \tag{6.145}$$

where arguments $a$ and $b$ symbolize the non-overlapping sets of variables of the two systems. Then the eigenkets of the system may be naturally factored as 47

$$|n\rangle = |n_a\rangle\otimes|n_b\rangle, \tag{6.146}$$

while its eigenenergies separate into a sum, just as the Hamiltonian (145) does:

$$\hat H^{(0)}|n\rangle = (\hat H_a + \hat H_b)|n_a\rangle\otimes|n_b\rangle = (\hat H_a|n_a\rangle)\otimes|n_b\rangle + (\hat H_b|n_b\rangle)\otimes|n_a\rangle = (E_{na}|n_a\rangle)\otimes|n_b\rangle + (E_{nb}|n_b\rangle)\otimes|n_a\rangle = (E_{na} + E_{nb})|n\rangle. \tag{6.147}$$

Analysis of such a composite system is much easier when the interaction of its components may be presented as a product of two Hermitian operators, each depending on the degrees of freedom of only one component system:

$$\hat H_{\rm int} = \hat A(a)\hat B(b). \tag{6.148}$$

A typical example of such a bilinear interaction Hamiltonian is the electric-dipole interaction between an atomic-scale electron system (with a size of the order of the Bohr radius $r_B \sim 10^{-10}$ m) and the electromagnetic field at optical frequencies $\omega \sim 10^{16}\ \text{s}^{-1}$, with wavelength $\lambda = 2\pi c/\omega \sim 10^{-6}$ m $\gg r_B$: 48

$$\hat H_{\rm int} = -\hat{\mathbf d}\cdot\hat{\boldsymbol{\mathcal E}}, \quad \text{with } \hat{\mathbf d} \equiv \sum_k q_k\hat{\mathbf r}_k, \tag{6.149}$$

where the electric dipole moment $\mathbf d$ depends only on the positions $\mathbf r_k$ of the charged particles (numbered with index $k$), while the electric field $\boldsymbol{\mathcal E}$ is a function of only the electromagnetic field's degrees of freedom - see Chapter 9 below.


47 The sign $\otimes$ is used to denote the formation of a joint ket-vector from the kets of independent systems (“belonging to different Hilbert spaces”). Evidently, the order of operands in such a “product” may be changed at will.

48 See, e.g., EM Sec. 3.1, in particular Eq. (3.16), in which letter p is used for the electric dipole moment.

Returning to the general situation shown in Fig. 14, if the component system a was initially in an excited state $n'_a$, interaction (148) may bring it to another discrete state $n_a$ of a lower energy - for example, the ground state. In the process of this transition, the released energy, in the form of the energy quantum

$$\hbar\omega = E_{n'a} - E_{na}, \tag{6.150}$$

is picked up by system b:

$$E_{nb} - E_{n'b} = \hbar\omega. \tag{6.151}$$


(In typical applications, though not always, the initial state $n'_b$ of that system is its ground state.) If the final state $n_b$ of the system is inside a state group with a quasi-continuous energy spectrum (Fig. 14), the process has the exponential character (114) 49 and may be interpreted as the effect of energy relaxation of system a, with the released energy quantum $\hbar\omega$ absorbed by system b. Note that since the quasi-continuous spectrum essentially requires a system of large spatial size, such a model is very convenient for the description of the environment of system a. (In physics, the “environment” typically means all the Universe less the system under consideration.)


The relaxation rate $\Gamma$ may be described by the Golden Rule. Since perturbation (148) does not depend on time explicitly, and the total energy of the composite system does not change, we may use Eq. (140), which, with the account of Eqs. (146) and (148), takes the form

$$\Gamma = \frac{2\pi}{\hbar}|A_{nn'}|^2|B_{nn'}|^2\rho_n, \quad \text{where } A_{nn'} \equiv \langle n_a|\hat A|n'_a\rangle, \quad B_{nn'} \equiv \langle n_b|\hat B|n'_b\rangle, \tag{6.152}$$

with $\rho_n$ being the density of the final states of system b at the relevant energy $E_{nb} = E_{n'b} + \hbar\omega = E_{n'b} + (E_{n'a} - E_{na})$. In particular, Eq. (152), with the dipole Hamiltonian (149), enables a very simple calculation of the natural linewidth of atomic electric-dipole transitions. However, such a calculation has to be postponed until Chapter 9, in which we will discuss the electromagnetic field quantization - i.e., the exact nature of states $n_b$ and $n'_b$ for this problem. Instead, I will proceed to a discussion of the effects of interaction of quantum systems with their environment, toward which the situation shown in Fig. 14 provides a clear path.
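The two algebraic facts used in this section, the additivity of eigenenergies of independent systems and the factorization of the matrix element of a bilinear interaction, may be illustrated with a minimal Python sketch. The dimensions (2 and 3), the random Hermitian matrices, and the helper name `random_hermitian` are my assumptions for illustration only; they do not model any particular physical system.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """A random n x n Hermitian matrix (illustrative stand-in for an operator)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

Ha, Hb = random_hermitian(2), random_hermitian(3)   # component Hamiltonians
A, B = random_hermitian(2), random_hermitian(3)     # factors of the interaction

# Unperturbed Hamiltonian of the composite system, as in Eq. (6.145)
H0 = np.kron(Ha, np.eye(3)) + np.kron(np.eye(2), Hb)

Ea, Ua = np.linalg.eigh(Ha)
Eb, Ub = np.linalg.eigh(Hb)

# Eigenenergies of H0 are all pairwise sums E_na + E_nb, as in Eq. (6.147)
pairwise_sums = np.sort((Ea[:, None] + Eb[None, :]).ravel())
spectrum = np.linalg.eigvalsh(H0)

# Matrix element of A(a)B(b) between factored kets, as used in Eq. (6.152)
na, nb, na_prime, nb_prime = 0, 2, 1, 0
ket_final = np.kron(Ua[:, na], Ub[:, nb])
ket_initial = np.kron(Ua[:, na_prime], Ub[:, nb_prime])
full_element = ket_final.conj() @ np.kron(A, B) @ ket_initial
A_element = Ua[:, na].conj() @ A @ Ua[:, na_prime]
B_element = Ub[:, nb].conj() @ B @ Ub[:, nb_prime]
print(np.allclose(pairwise_sums, spectrum), abs(full_element - A_element * B_element))
```

Here `np.kron` plays the role of the sign $\otimes$, with the index of system a as the slower-running one.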


6.8. Exercise problems 

6.1. Use Eq. (13) to prove the Hellmann-Feynman theorem: 50

$$\frac{\partial E_n}{\partial\lambda} = \langle n|\frac{\partial\hat H}{\partial\lambda}|n\rangle,$$

where $\lambda$ is an arbitrary c-number parameter, and then use this theorem to prove the first of Eqs. (3.201).


6.2 . Analyze the relation between Eq. (15) and the results of classical analysis 51 of a similar 
anharmonic (“nonlinear”) oscillator. 


49 The process is evidently spontaneous, i.e. does not require any external agent, and starts as soon as either the interaction (148) has been turned on, or (if it is always on) as soon as system a is placed into the excited state $n'_a$.

50 As a reminder, its proof for the particular case of wave mechanics was the subject of Problem 1.4.

6.3. A weak additional force $F$ is applied to a 1D particle that was placed into a hard-wall quantum well with

$$U(x) = \begin{cases} 0, & \text{for } 0 < x < a, \\ +\infty, & \text{otherwise.} \end{cases}$$

Calculate, sketch, and discuss the first-order perturbation of its ground-state wavefunction.


6.4. A 2D quantum particle is confined in a square-shaped quantum well with infinitely high walls and a slightly skewed floor:

$$U = \begin{cases} \mu xy, & \text{for } 0 < x < L \text{ and } 0 < y < L, \\ +\infty, & \text{otherwise.} \end{cases}$$

In the first order in the small parameter $\mu$, find the energies of the ground state and the lowest excited state of the system. Formulate the conditions of validity of your result.

Hint: To save the reader's time on a straightforward but longish integration by parts, I can offer the following integral:

$$\int_0^\pi \xi\,\sin\xi\,\sin 2\xi\;d\xi = -\frac{8}{9}.$$


6.5. Calculate the lowest-order relativistic correction to the ground-state energy of a 1D harmonic oscillator.

6.6. A 1D particle of mass $m$ is localized at a narrow potential well which may be approximated with a delta-function:

$$U(x) = -\mathcal{W}\delta(x), \quad \text{with } \mathcal{W} > 0.$$

Calculate the change of its ground state energy by an additional weak, time-independent force $F$, in the first nonvanishing approximation of the perturbation theory. Discuss the limits of validity of this result, taking into account that at $F \neq 0$, the localized state of the particle is metastable.

6.7. Use the perturbation theory results to calculate the eigenvalues of the observable $L^2$, in the limit $l \approx |m| \gg 1$, by purely wave-mechanical means.

Hint: Try the following substitution: $\Theta(\theta) = f(\theta)/\sin^{1/2}\theta$.

6.8. In the first nonvanishing order of the perturbation theory, calculate the shift of the ground-state energy of an electrically charged spherical rotator (i.e. a particle of mass $m$, free to move over a spherical surface of radius $R$) due to a weak, uniform, time-independent electric field $\mathcal{E}$.

6.9. Use the perturbation theory to evaluate the effect of a constant electric field $\mathcal{E}$ on the ground state energy $E_g$ of a hydrogen atom. In particular:

(i) calculate the 1st-order shift of $E_g$,

(ii) bring the expression for the 2nd-order shift (neglecting the extended unperturbed states with $E > 0$) to the simplest possible analytical form,

(iii) find the lower and upper bounds on the result, and

(iv) discuss the simplest manifestations of the shift (called the quadratic Stark effect).

51 See, e.g., CM Sec. 4.2.

6.10.* A particle of mass $m$, with electric charge $q$, is in its ground s-state with known energy $E_g < 0$, being localized by a very short-range, spherically-symmetric potential well. Calculate its electric polarizability $\alpha$.

6.11. In the first nonvanishing order of the perturbation theory, calculate the correction to the energies of the ground state and all lowest excited states of a hydrogen-like atom/ion, due to the electron's penetration into its nucleus, modeling it as a spinless, uniformly charged sphere of radius $R \ll r_B/Z$.

6.12. A spin-½ particle is placed into a magnetic field

$$\boldsymbol{\mathcal B} = \mathcal{B}_z\mathbf n_z + \mathcal{B}_x\mathbf n_x, \quad \text{with } |\mathcal{B}_x| \ll |\mathcal{B}_z|.$$

Calculate its energy levels:

(i) exactly, and

(ii) in the first nonvanishing order of the perturbation theory in small $\mathcal{B}_x$.

Compare the results of the two approaches.

6.13. Use the perturbation theory to analyze the orbital diamagnetism. Namely, calculate the magnetic susceptibility $\chi_m$ of a dilute gas due to the orbital motion of a single electron confined inside each gas particle.

Hint: You may like to use the well-known formula for the magnetic energy $u$ per unit volume of a linear medium:

$$u = \frac{\mathcal{B}^2}{2\mu},$$

where $\mathcal{B}$ is the applied magnetic field, and $\mu$ is the magnetic permeability, related to the susceptibility as $\mu = \mu_0(1 + \chi_m)$. 52


6.14.* Analyze the statistics of the spacing $S \equiv E_+ - E_-$ between the energy levels of a two-level system, assuming that all elements $H_{jj'}$ of its Hamiltonian matrix (6.27) are independent random numbers, with equal and constant probability densities within the energy interval of interest. Compare the result with that for a purely diagonal matrix, with a similar probability distribution of the diagonal elements.

6.15. Discuss how to calculate the energy level degeneracy lifting in the second order of the perturbation theory, assuming that it is not lifted in the first order. Carry out such a calculation for a plane rotator of mass $m$ and radius $R$, carrying electric charge $q$, and placed into a weak, uniform, constant electric field $\mathcal{E}$.

52 See, e.g., EM Sec. 5.5, in particular Eqs. (5.127) and (5.112).

6.16. Use the single-particle approximation to find the complex dielectric constant $\varepsilon(\omega)$ of a dilute gas of similar atoms, due to their induced electric polarization by a weak external ac field, for a field frequency $\omega$ very close to one of the quantum transition frequencies $\omega_{nn'}$ defined by Eq. (85).

Hint: In the single-particle approximation, an atom's response to an external field is described as that of $Z$ similar, non-interacting electrons moving in an effective static attracting potential - generally induced not only by the nuclei but also by other electrons.

6.17. Use the solution of the previous problem to generalize the expression for the London dispersion force between two electroneutral molecules (whose calculation in the harmonic oscillator model was the subject of Problems 3.18 and 5.11) to the single-particle model with an arbitrary energy spectrum.

6.18. Use the solution of the previous problem to calculate the potential energy of interaction of two hydrogen atoms, both in their ground state, separated by distance $r \gg r_B$.

6.19. In a certain quantum system, distances between the three lowest energy levels are slightly different - see Fig. on the right ($|\xi| \ll \omega_{1,2}$). Find the time necessary to populate the first excited level almost completely (with a given precision $\varepsilon \ll 1$), using the Rabi oscillation effect, if at $t = 0$ the system is completely in its ground state.

Hint: Assume that all matrix elements of the perturbation Hamiltonian are known, and are all proportional to the external rf field amplitude.

6.20. A weak external force pulse $F(t)$, of a finite time duration, is applied to a 1D harmonic oscillator that initially was in its ground state.

(i) Calculate, in the lowest nonvanishing order of the perturbation theory, the probability that the pulse drives the oscillator into an excited state.

(ii) Formulate the condition of validity of the result, and compare it with the exact solution of the problem.

(iii) Spell out the perturbative result for a Gaussian-shaped waveform,

$$F(t) = F_0\exp\{-t^2/\tau^2\},$$

and analyze its dependence on the effective duration $\tau$ of the pulse.

6.21. A charged plane rotator, initially in its ground state, is placed into a spatially-uniform, but time-dependent external electric field $\mathcal{E}(t)$, applied at $t = 0$.

(i) Calculate, in the lowest nonvanishing order in the field's strength, the probability that the field drives the rotator into its $n$-th excited state.

(ii) Spell out and analyze your results for a rotating field.

(iii) Same for an ac field with fixed polarization.


[Figure for Problem 6.19: three lowest energy levels, separated by $\hbar\omega_1$ and $\hbar\omega_2 = \hbar(\omega_1 + \xi)$.]

6.22.

(i) Develop the general theory of excitations of the higher levels of a discrete-spectrum system, 
initially in the ground state, by a weak time-dependent perturbation, up to the 2 nd order. 

(ii) Apply the theory to the system analyzed in the previous problem (a plane rotator driven by a 
time-dependent electric field) to find out what excitations, forbidden in the 1 st order of the perturbation 
theory, are allowed in its 2 nd order. 

6.23. A heavy, relativistic particle, with the electric charge $q = Ze$, passes by a hydrogen atom, initially in its ground state, with the impact parameter (the shortest distance) $b$ within the limits $r_B \ll b \ll r_B/\alpha$, where $\alpha \approx 1/137$ is the fine structure constant. Calculate the probability of the atom's transition to its lowest excited states.

6.24.* A particle of mass $m$ is initially in the localized ground state, with the known energy $E_g < 0$, of a very small, spherically-symmetric potential well. Calculate the rate of its delocalization (“ionization”) by an applied force $\mathbf F(t) = \mathbf n_F F_0\cos\omega t$, with a time-independent orientation $\mathbf n_F$, and discuss its dependence on frequency $\omega$.

6.25.* Calculate the rate of ionization of a hydrogen atom, initially in its ground state, by a classical, linearly polarized electromagnetic wave with the electric field's amplitude $\mathcal{E}_0$ and frequency $\omega$ within the range

$$\frac{\hbar}{m_e r_B^2} \ll \omega \ll \frac{c}{r_B},$$

where $r_B$ is the Bohr radius. Recast your result in terms of the cross-section of this electromagnetic wave absorption process. Discuss semi-quantitatively what changes would be necessary in the theory if either of the above conditions had been violated.

6.26. For the system of two weakly coupled quantum wells (see Fig. 13a), write the system of differential equations for the probability amplitudes $a_n$ defined by Eq. (2.199), and in particular prove Eqs. (2.201) - which were just guessed in Sec. 2.7.

6.27. Use the quantum-mechanical Golden Rule to derive the general expression for the electric current $I$ through a weak tunnel junction between two conductors, biased with dc voltage $V$, treating the conduction electrons as a Fermi gas, in which the electron-electron interaction is limited to the Pauli exclusion principle. Simplify the result in the low-voltage limit.

Hint: The electric current flowing through a weak tunnel junction is so low that its perturbation of the electron states inside each conductor is negligible.

6.28.* Generalize the result of the previous problem to the case when a weak tunnel junction is biased with voltage $V(t) = \bar V + A\cos\omega t$, with $\hbar\omega$ generally comparable with $e|\bar V|$ and $eA$.

6.29.* Use the quantum-mechanical Golden Rule to derive the Landau-Zener formula (2.266).

6.30. Calculate, in the 2nd order of the perturbation theory, the rate $\Gamma$ of transitions between different states of a continuous group (of virtually the same energy $E_n$), induced by a monochromatic perturbation of frequency $\omega$, with $\hbar\omega$ comparable to the distances between other, discrete levels of the system.

Chapter 7. Open Quantum Systems

This chapter discusses the effects of interaction of a quantum system with its environment, and in particular, with the instruments used for measurements. Some part of this material is on the fine line between quantum mechanics and (quantum) statistical physics. Here I will only cover those aspects of this field which are of key importance for the basic goals of this course, in particular for the discussion of quantum measurements at the end of the chapter. 1


7.1. Open systems and the density matrix 

All the way until the very end of the previous chapter, we have discussed quantum systems isolated from their environment. Indeed, from the very beginning we have assumed that we are dealing with statistical ensembles of systems as similar to each other as the laws of quantum mechanics allow. Each member of such an ensemble, called pure or coherent, may be described by the same quantum state $\alpha$ - in the wave mechanics case, by the same wavefunction $\Psi_\alpha$. Even our discussion of the Golden Rule at the end of the last chapter, in particular its form in which one component system (in Fig. 6.14, system b) may be used as a model of the environment of another component (a), was still based on the assumption of a pure initial state (6.146) of the system. Since the interaction of two component systems was described by a certain Hamiltonian (the one given by Eq. (6.148), for example), for the state $\alpha$ of the system as a whole at an arbitrary instant we might write

$$|\alpha\rangle = \sum_n\alpha_n|n\rangle = \sum_n\alpha_n|n_a\rangle\otimes|n_b\rangle, \tag{7.1}$$

with a unique correspondence between eigenstates $n_a$ and $n_b$.

However, in many important cases our knowledge of a quantum system's state is incomplete. This is especially unavoidable 2 when a relatively simple quantum system $s$ of our interest (say, an electron or an atom) is in contact with its environment $e$ - here understood in the most general sense, say, as the whole Universe less system $s$ - see Fig. 1. Then there is virtually no chance of making two or more experiments with exactly the same composite system, because that would imply a repeated preparation of the whole environment (including the experimenter :-) in a certain quantum state - a rather challenging task, to put it mildly. In this case, it makes much more sense to consider a statistical ensemble of another kind, with random quantum states of the environment, though possibly with known macroscopic parameters (e.g., temperature, pressure, etc.).

In classical physics, such mixed ensembles are the subject of statistical (classical) mechanics. 3 Let us see how they may be described in quantum mechanics. To begin, we need to assume again that the coupling between the system of interest and its environment is weak in the sense accepted in the perturbation theory. 4 In this case we can still use the bra- and ket-vectors of unperturbed states, which depend on different sets of variables (again, “belonging to different Hilbert spaces”). Then the most general quantum state of the whole Universe, still assumed to be pure, 5 may be described as the following linear superposition:

$$|\alpha\rangle = \sum_{j,k}\alpha_{jk}|s_j\rangle\otimes|e_k\rangle. \tag{7.2}$$

1 For a broader discussion of statistical mechanics and physical kinetics, including those of quantum systems, the reader is referred to the SM part of this lecture note series.

2 Most of the mixed ensemble analysis in this chapter will pertain also to the cases when the systems of interest are not in contact with the environment currently, and our knowledge about them is incomplete for some other reason - for example, if they had been in such a contact at some time between their perfect preparation (in a certain quantum state) and the observation, or if such a perfect preparation is impossible (or impracticable, or undesirable :-).

3 See, e.g., SM Sec. 2.1.


The “only” difference between the description of such an entangled state and the superposition of separable states, described by Eq. (1), is that the coefficients $\alpha_{jk}$ on the right-hand side of Eq. (2) are numbered with two indices: index $j$ listing the quantum states of system $s$, and index $k$ numbering the (enormously large) set of quantum states of the environment. So, in a mixed ensemble a certain state $s_j$ of the system of interest may coexist with different states of its environment. 6 Of course, the enormity of the Hilbert space of the environment, i.e. the number of $k$-components in sum (2), strips us of any opportunity to make direct calculations using that sum. For example, according to the basic Eq. (4.125), in order to find the expectation value of an arbitrary observable $A$ in state (2), we would need to calculate

$$\langle A\rangle = \langle\alpha|\hat A|\alpha\rangle = \sum_{j,j';\,k,k'}\alpha^*_{jk}\alpha_{j'k'}\langle e_k|\otimes\langle s_j|\hat A|s_{j'}\rangle\otimes|e_{k'}\rangle. \tag{7.3}$$

Even if we assume that $\{s\}$ and $\{e\}$ are sets of the basis states of, respectively, the system and the environment, and that each is full and orthonormal, Eq. (3) still includes a double sum over the enormous basis state set of the environment!



However, let us consider a limited but the most important subset of operators - those of intrinsic observables, which depend only on the degrees of freedom of the system of interest ($s$). These operators do not act on the environment's degrees of freedom, and hence in Eq. (3) we may move the environment bra-vector $\langle e_k|$ all the way over to the ket-vector $|e_{k'}\rangle$. Assuming, again, that the set of environmental eigenstates is full and orthonormal, Eq. (3) is now reduced to


4 In the opposite case, the very partition of the Universe into the system and the environment is impossible. 

5 Whether this assumption is true is an interesting issue, still being debated (more by philosophers than by physicists), but it is widely believed that its solution is not critical for the validity of the results of this approach. In Sec. 6, I will offer a strong argument for this opinion - albeit not its strict proof.

6 Actually, such coexistence has been implied (but well hidden :-) in the derivation of the quantum-mechanical 
Golden Rule, which in all fairness, also belongs to the open systems class. 




Essential Graduate Physics 


QM: Quantum Mechanics 


$$\langle A\rangle = \sum_{j,j';\,k}\alpha^*_{jk}\alpha_{j'k}\langle s_j|\hat A|s_{j'}\rangle = \sum_{j,j'}A_{jj'}\sum_k\alpha^*_{jk}\alpha_{j'k}. \tag{7.4}$$


This is already some relief, because we have “only” a single sum over $k$, but the main trick 7 is still ahead. After the summation over $k$, the second sum in the last form of Eq. (4) is some function $w$ of indices $j$ and $j'$, so that, according to Eq. (4.96), this relation may be presented as

$$\langle A\rangle = \sum_{j,j'}A_{jj'}w_{j'j} = \mathrm{Tr}(\mathrm{A}\mathrm{w}), \tag{7.5}$$

where matrix w, with elements

$$w_{j'j} \equiv \sum_k\alpha^*_{jk}\alpha_{j'k}, \quad \text{i.e. } w_{jj'} = \sum_k\alpha_{jk}\alpha^*_{j'k}, \tag{7.6}$$


is called the density matrix of the system. Most importantly, Eq. (5) shows that the knowledge of this matrix allows the calculation of the expectation value of any intrinsic observable $A$ (and, according to Eqs. (1.33)-(1.34), its r.m.s. fluctuation as well, if necessary), even for the very general statistical ensemble of states (2). This is why the density matrix deserves a very good look.


First of all, as we know very well by now, the expansion coefficients in superpositions of the 
type (2) may always be expressed as bra-kets; in our current case, we may write 

$$\alpha_{jk} = \left(\langle e_k| \otimes \langle s_j|\right)|\alpha\rangle. \tag{7.7}$$

Plugging this expression into Eq. (6), we get 

$$w_{jj'} = \langle s_j|\left(\sum_k \langle e_k|\alpha\rangle\langle\alpha|e_k\rangle\right)|s_{j'}\rangle. \tag{7.8}$$

We see that from the point of view of our system (i.e. in its Hilbert space whose basis states may be numbered 
by indices j only), the density matrix is indeed just the matrix of some construct, 8 

Statistical operator: definition 

$$\hat w \equiv \sum_k \langle e_k|\alpha\rangle\langle\alpha|e_k\rangle, \tag{7.9}$$


that is called the statistical (or “density”) operator. As evident from its definition (9), in contrast to the 
density matrix this operator does not depend on the choice of a particular basis $s_j$ - just as all previous 
operators considered in this course, but in contrast to them, the statistical operator does depend on 
the composite system’s state $\alpha$, including the state of system s as well. However, in the j-space it is 
mathematically still an operator whose matrix elements obey all formulas of the bra-ket formalism. 
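
As a numerical illustration of Eqs. (5)-(6), here is a minimal sketch (not part of the original derivation) that builds the density matrix from randomly chosen coefficients $\alpha_{jk}$ and compares Tr(Aw) with the expectation value computed directly in the composite Hilbert space. The dimensions `n_s`, `n_e`, the random seed, and all variable names are arbitrary assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_s, n_e = 3, 4  # assumed (illustrative) dimensions of system and environment

# Random normalized coefficients alpha[j, k] of the composite state
alpha = rng.normal(size=(n_s, n_e)) + 1j * rng.normal(size=(n_s, n_e))
alpha /= np.linalg.norm(alpha)

# Density matrix, Eq. (6): w[j, j'] = sum_k alpha[j, k] * conj(alpha[j', k])
w = alpha @ alpha.conj().T

# A random Hermitian "intrinsic" observable, acting on the system only
m = rng.normal(size=(n_s, n_s)) + 1j * rng.normal(size=(n_s, n_s))
A = (m + m.conj().T) / 2

# Direct expectation value in the composite space: A (x) identity
psi = alpha.reshape(-1)               # composite index is j * n_e + k
A_full = np.kron(A, np.eye(n_e))
direct = np.vdot(psi, A_full @ psi)   # <alpha| A |alpha>

via_trace = np.trace(A @ w)           # Eq. (5)
```

Within rounding errors, `direct` and `via_trace` coincide, and `w` comes out Hermitian with unit trace, so that the environment enters the result only via the sum over k.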

In particular, due to its definition (6), the density operator is Hermitian: 


$$w^*_{jj'} = \sum_k \alpha^*_{jk}\,\alpha_{j'k} = \sum_k \alpha_{j'k}\,\alpha^*_{jk} = w_{j'j}, \tag{7.10}$$


7 First suggested in 1927 by J. von Neumann. 

8 Of course the “bra-kets” in this expression are not c-numbers, because state $\alpha$ is defined in a larger Hilbert space 
(of the environment plus the system of interest) than the basis states $e_k$ (of the environment only). 

so that according to the general analysis of Sec. 4.3, there should be a certain basis {w} in which the 
matrix of this operator is diagonal: 

$$w_{jj'} = w_j\,\delta_{jj'}. \tag{7.11}$$

Since any operator, in any basis, may be presented in form (4.59), in basis {w} we may write 

Statistical operator in diagonalizing basis 

$$\hat w = \sum_j |w_j\rangle\, w_j\, \langle w_j|. \tag{7.12}$$

This expression is reminiscent of, but not equivalent to, Eq. (4.44) for the identity operator, which has been used 
so many times in this course, and in the basis $w_j$ has the form 

$$\hat I = \sum_j |w_j\rangle\langle w_j|. \tag{7.13}$$


In order to comprehend the meaning of coefficients $w_j$ participating in Eq. (12), let us use Eq. (5) 
to calculate the expectation value of any observable A whose eigenstates coincide with those of the 
special basis set {w}: 

Expectation value of $\hat w$-compatible variable 

$$\langle A\rangle = \mathrm{Tr}(\mathrm{A}\mathrm{w}) = \sum_{j,j'} A_{jj'}\, w_j\,\delta_{jj'} = \sum_j A_j w_j, \tag{7.14}$$

where $A_j$ is just the expectation value of observable A in state $w_j$. Hence, in order to comply with the 
general Eq. (1.37), real c-numbers $w_j$ must have the physical sense of probabilities $W_j$ of finding the 
system in state j. As a result, we can rewrite Eq. (12) in the form 

$$\hat w = \sum_j |w_j\rangle\, W_j\, \langle w_j|. \tag{7.15}$$

In one ultimate case when only one of the probabilities (say, $W_{j''}$) is different from zero, 

$$W_j = \delta_{jj''}, \tag{7.16}$$

the system is evidently in a coherent (pure) state $w_{j''}$. Indeed, it is fully described by one ket-vector $|w_{j''}\rangle$, 
and we can use the general rule (4.86) to present it in another (arbitrary) basis {s} as a coherent 
superposition 

$$|w_{j''}\rangle = \sum_{j'} U^\dagger_{j'j''}\,|s_{j'}\rangle = \sum_{j'} U^*_{j''j'}\,|s_{j'}\rangle, \tag{7.17}$$

where U is the unitary matrix of transform from basis {w} to basis {s}. According to Eqs. (11) and (16), 
in such a pure state the density matrix is diagonal in the {w} basis, 

$$w_{jj'}\big|_{\text{in } w} = \delta_{j,j''}\,\delta_{j',j''}, \tag{7.18a}$$

but not in an arbitrary basis. Indeed, using the general rule (4.92), we get 

$$w_{jj'}\big|_{\text{in } s} = \sum_{l,l'} U^\dagger_{jl}\, w_{ll'}\big|_{\text{in } w}\, U_{l'j'} = U^\dagger_{jj''}\,U_{j''j'} = U^*_{j''j}\,U_{j''j'}. \tag{7.18b}$$


To make this result more transparent, let us denote matrix elements $U_{j''j} \equiv \langle w_{j''}|s_j\rangle$ (that, for fixed 
j'', depend on just one index j) by $\alpha_j$; then 

Density matrix in a pure state 

$$w_{jj'}\big|_{\text{in } s} = \alpha^*_j\,\alpha_{j'}, \tag{7.19}$$

so that the $N^2$ elements of the whole N×N matrix are determined by just one string of N c-numbers $\alpha_j$. For 
example, for a two-level system (N = 2), 

$$\mathrm{w}\big|_{\text{in } s} = \begin{pmatrix} \alpha_1\alpha_1^* & \alpha_2\alpha_1^* \\ \alpha_1\alpha_2^* & \alpha_2\alpha_2^* \end{pmatrix}. \tag{7.20}$$



We see that the off-diagonal terms are, colloquially, “as large as the diagonal ones”, in the following 
sense: 

$$w_{12}w_{21} = w_{11}w_{22}. \tag{7.21}$$

Since the diagonal terms have the sense of probabilities $W_{1,2}$ to find the system in the corresponding 
state, we may present Eq. (20) in the form 

$$\mathrm{w} = \begin{pmatrix} W_1 & (W_1W_2)^{1/2}e^{i\varphi} \\ (W_1W_2)^{1/2}e^{-i\varphi} & W_2 \end{pmatrix}. \tag{7.22}$$


The physical sense of the (real) constant $\varphi$ is the phase shift between the coefficients in the linear 
superposition (17) that presents the pure state $w_{j''}$ in basis $s_{1,2}$. 

Now let us consider a different statistical ensemble of two-level systems, that includes member 
states identical in all aspects (including similar probabilities $W_{1,2}$ in the same basis $s_{1,2}$), except that the 
phase shifts $\varphi$ are random, with the phase probability uniformly distributed over the trigonometric circle. 
Then the ensemble averaging is equivalent to averaging over $\varphi$ from 0 to $2\pi$, so that it kills the off- 
diagonal terms of the density matrix (22), and the matrix becomes diagonal. For a system with a time- 
independent Hamiltonian, such averaging is especially plausible in the basis of stationary states n of the 
system, in which phase $\varphi$ is just the difference of integration constants in Eq. (4.158), and randomness is 
naturally produced by minor fluctuations of the energy difference $E_1 - E_2$. (In Sec. 3 we will study the 
dynamics of such a dephasing process.) The mixed statistical ensemble of systems with the density matrix 
diagonal in the stationary state basis is called the classical mixture, and presents the limit opposite to the 
pure (coherent) state. 
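
The phase averaging just described is easy to reproduce numerically. The following is only a sketch under stated assumptions: the value `W1 = 0.3`, the uniform grid of 360 phases, and the function name are arbitrary illustrative choices, and the quantity Tr(w²) (frequently called the purity) is my addition, not used in the text above; it equals 1 for a pure state and is smaller for a mixture.

```python
import numpy as np

def pure_state_matrix(W1, phi):
    """Density matrix of the form (7.22) for a pure two-level state."""
    W2 = 1.0 - W1
    off = np.sqrt(W1 * W2) * np.exp(1j * phi)
    return np.array([[W1, off], [np.conj(off), W2]])

W1 = 0.3  # assumed probability of state 1
# Uniform grid of phases over the full circle (endpoint excluded)
phases = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
w_avg = np.mean([pure_state_matrix(W1, p) for p in phases], axis=0)

w_pure = pure_state_matrix(W1, 0.7)
purity_pure = np.trace(w_pure @ w_pure).real   # Tr(w^2) for one fixed phase
purity_mixed = np.trace(w_avg @ w_avg).real    # Tr(w^2) after phase averaging
```

The averaged matrix `w_avg` keeps the diagonal elements $W_1$ and $W_2$ intact, while its off-diagonal elements vanish within rounding errors, i.e. it describes the classical mixture.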

After that example, the reader should not be much shocked by the main claim 9 of statistical 
mechanics that any large ensemble of similar systems in thermodynamic (or “thermal”) equilibrium is 
exactly such a classical mixture. Moreover, for systems in thermal equilibrium with a much larger 
environment with fixed temperature T (such an environment is usually called a heat bath or a thermostat) 
statistical physics gives 10 a very simple expression, called the Gibbs distribution, for probabilities $W_n$: 

Gibbs distribution 

$$W_n = \frac{1}{Z}\exp\left\{-\frac{E_n}{k_BT}\right\}, \tag{7.23a}$$


9 This is essentially an alternative formulation of the basic postulate of statistical physics, called the 
microcanonical distribution - see, e.g., SM Sec. 2.2. 

10 See, e.g., SM Sec. 2.4. The Boltzmann constant $k_B$ is only needed if temperature is measured in non-energy 
units, say in kelvins. 

where $E_n$ is the eigenenergy of the corresponding stationary state, and Z is the normalization coefficient 
called the statistical sum 

$$Z = \sum_n \exp\left\{-\frac{E_n}{k_BT}\right\}. \tag{7.23b}$$


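A detailed analysis of classical and quantum ensembles in thermodynamic equilibrium is the 
focus of statistical physics courses (such as my SM) rather than this course of quantum mechanics. 
However, I would still like to draw the reader’s attention to the key fact that, in contrast with the similar- 
looking Boltzmann distribution for single particles, 11 the Gibbs distribution is absolutely general and is 
not limited to classical statistics. In particular, for quantum gases of indistinguishable particles, it is 
absolutely compatible with quantum statistics (such as the Bose-Einstein or Fermi-Dirac distributions) 
of the component particles. For example, if we use Eq. (23) to calculate the average energy of a 1D 
harmonic oscillator of frequency $\omega_0$ in thermal equilibrium, we easily get 12 

$$Z = \frac{\exp\{-\hbar\omega_0/2k_BT\}}{1-\exp\{-\hbar\omega_0/k_BT\}}, \tag{7.24}$$

$$W_n = \left(1-\exp\left\{-\frac{\hbar\omega_0}{k_BT}\right\}\right)\exp\left\{-\frac{n\hbar\omega_0}{k_BT}\right\}, \tag{7.25}$$

Harmonic oscillator in thermal equilibrium 

$$\langle E\rangle = \sum_{n=0}^{\infty} W_nE_n = \frac{\hbar\omega_0}{2} + \frac{\hbar\omega_0}{\exp\{\hbar\omega_0/k_BT\}-1} = \frac{\hbar\omega_0}{2}\coth\frac{\hbar\omega_0}{2k_BT}. \tag{7.26a}$$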


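An alternative way to present the last result is to write 

$$\langle E\rangle = \frac{\hbar\omega_0}{2} + \hbar\omega_0\langle n\rangle, \quad\text{with}\quad \langle n\rangle = \frac{1}{\exp\{\hbar\omega_0/k_BT\}-1}, \tag{7.26b}$$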


and to interpret it as the fact that in addition to the so-called zero-point energy $\hbar\omega_0/2$ of the ground state, 
the oscillator (on the average) has $\langle n\rangle$ thermally-induced excitations, with energy $\hbar\omega_0$ each. In the 
harmonic oscillator, whose energy levels are equidistant, such a language is completely appropriate, 
because the transfer from any level to the one just above it adds the same amount of energy, $\hbar\omega_0$, to the 
system. The above expression for $\langle n\rangle$ is actually the Bose-Einstein distribution (for the particular case of 
zero chemical potential); 13 we see that it not only does not contradict the Gibbs distribution (for the total 
energy of the system), but immediately follows from it. 14 
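
The exercise recommended in footnote 12 may also be checked numerically. Here is a short sketch, assuming units in which $\hbar\omega_0 = 1$ and $k_B = 1$ (so that `theta` stands for $k_BT/\hbar\omega_0$), a truncation of the level sum at `n_max`, and function and variable names of my own choosing. It sums the Gibbs distribution (23) over the levels $E_n = n + 1/2$ directly and compares the results with the closed forms following from the geometric progression.

```python
import numpy as np

def oscillator_averages(theta, n_max=2000):
    """Direct Gibbs sums for E_n = n + 1/2; theta = k_B*T in units of hbar*omega_0."""
    n = np.arange(n_max)
    E = n + 0.5
    boltz = np.exp(-E / theta)
    Z = boltz.sum()                 # statistical sum, Eq. (23b)
    W = boltz / Z                   # probabilities, Eq. (23a)
    return Z, (W * E).sum(), (W * n).sum()

theta = 1.5  # assumed (illustrative) temperature
Z, E_avg, n_avg = oscillator_averages(theta)

# Closed forms obtained by summing the geometric progression
Z_closed = np.exp(-0.5 / theta) / (1.0 - np.exp(-1.0 / theta))
E_closed = 0.5 / np.tanh(0.5 / theta)      # (1/2) coth(1 / 2 theta)
n_closed = 1.0 / np.expm1(1.0 / theta)     # Bose-Einstein occupancy
```

The direct sums agree with the closed forms, and `E_avg` equals `0.5 + n_avg`, as in the “zero-point energy plus thermal excitations” reading of the result.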


11 See, e.g., SM Sec. 2.8. 

12 See, e.g., SM Sec. 2.5 - but mind a different energy reference level, Eq = Tico, used in Eqs. (2.68)-(2.69), 
affecting the expression for Z. Actually, the calculation is so straightforward (just the summation of a geometric 
progression for the enumeration of Z) that it is highly recommended to the reader as a simple exercise. 

13 See, e.g., SM Sec. 2.8. 

14 Because of the fundamental importance of Eq. (26) for many fields of physics, let me remind the reader of its 
main properties. At low temperatures, $k_BT \ll \hbar\omega_0$, there are virtually no thermal excitations, $\langle n\rangle \to 0$, and the 
average energy of the oscillator is dominated by that of its ground state. In the opposite limit of high temperatures, 
$\langle n\rangle \to k_BT/\hbar\omega_0 \gg 1$, and $\langle E\rangle$ approaches the classical value $k_BT$ (following, for example, from the equipartition 
theorem, which assigns energy $k_BT/2$ to each quadratic contribution to the system’s energy - in the 1D oscillator case, 
to one potential and one kinetic energy term). 


7.2. Coordinate representation and the Wigner function 

For many applications of the density matrix to wave mechanics, its coordinate representation is 
convenient. (I will only discuss it for the 1D case; the generalization to the multi-dimensional case is 
straightforward.) Following Eq. (4.47), it is natural to define the following function of two arguments 
(frequently also called the density matrix): 

Density matrix in coordinate representation 

$$w(x,x') \equiv \langle x|\hat w|x'\rangle. \tag{7.27}$$

Inserting, into the right-hand part of this definition, two closure conditions (4.44) for an arbitrary (full 
and orthonormal) basis {s}, and then using Eq. (5.19), we get 15 

$$w(x,x') = \sum_{j,j'} \langle x|s_j\rangle\langle s_j|\hat w|s_{j'}\rangle\langle s_{j'}|x'\rangle = \sum_{j,j'} \psi_j(x)\, w_{jj'}\big|_{\text{in } s}\, \psi^*_{j'}(x'). \tag{7.28}$$

In the special basis {w}, in which the density matrix is diagonal, this expression is reduced to 

$$w(x,x') = \sum_j \psi_j(x)\,W_j\,\psi^*_j(x'). \tag{7.29}$$

Let us discuss the properties of this function. At coinciding arguments, x' = x, this is just the 
probability density: 16 

$$w(x,x) = \sum_j \psi_j(x)\,W_j\,\psi^*_j(x) = \sum_j w_j(x)\,W_j = w(x). \tag{7.30}$$

However, the density matrix gives more information about the system than just the probability density. 
As the simplest example, let us consider a pure quantum state, with $W_j = \delta_{jj'}$, so that $\psi(x) = \psi_{j'}(x)$, and 

$$w(x,x') = \psi_{j'}(x)\,\psi^*_{j'}(x') = \psi(x)\,\psi^*(x'). \tag{7.31}$$

We see that the density matrix carries the information not only about the modulus but also the phase of 
the wavefunction. (Of course one may argue rather convincingly that in this ultimate limit the density- 
matrix description is redundant, because all this information is contained in the wavefunction itself.) 

How may the density matrix be interpreted? In the simple case (31), we can write 

$$|w(x,x')|^2 = w(x,x')\,w^*(x,x') = \psi(x)\psi^*(x)\,\psi(x')\psi^*(x') = w(x)\,w(x'), \tag{7.32}$$

so that the modulus squared of the density matrix is just the joint probability density to find the 
system at point x and point x'. For example, for a simple wave packet with the spatial extent $\delta x$, w(x,x') 
is appreciable only if both points are not farther than $\delta x$ from the packet center, and hence from each 
other. The interpretation becomes more complex if we deal with an incoherent mixture of several 
wavefunctions, for example the classical mixture describing the thermodynamic equilibrium. In this 
case, we can use Eq. (23) to rewrite Eq. (29) as follows: 

$$w(x,x') = \sum_n \psi_n(x)\,W_n\,\psi^*_n(x') = \frac{1}{Z}\sum_n \psi_n(x)\exp\left\{-\frac{E_n}{k_BT}\right\}\psi^*_n(x'). \tag{7.33}$$

15 For now, I will focus on a fixed time instant (say, t = 0), and hence write $\psi(x)$ instead of $\Psi(x, t)$. 

16 This fact is the historic origin of the density matrix’s name. 


As the simplest example, let us see what the density matrix of a free (1D) particle in 
thermal equilibrium is. As we know very well, in this case the set of energies $E_p = p^2/2m$ of stationary 
states (monochromatic waves) forms a continuum, so that we need to replace sum (33) by an integral, 
taking “delta-normalized” traveling wavefunctions (5.59) as eigenstates: 

$$w(x,x') = \frac{1}{2\pi\hbar Z}\int_{-\infty}^{+\infty}\exp\left\{-\frac{p^2}{2mk_BT}\right\}\exp\left\{\frac{ip(x-x')}{\hbar}\right\}dp. \tag{7.34}$$


This is a usual Gaussian integral, and may be worked out, as we have done repeatedly in Chapter 2 and 
beyond, by complementing the exponent to the full square of momentum plus a constant. The statistical 
sum Z may be also readily calculated, 17 

$$Z = (2\pi mk_BT)^{1/2}. \tag{7.35}$$

However, for what follows it is more useful to write the result for the product wZ (the so-called un- 
normalized density matrix): 

Free particle in thermal equilibrium 

$$w(x,x')Z = \left(\frac{mk_BT}{2\pi\hbar^2}\right)^{1/2}\exp\left\{-\frac{mk_BT\,(x-x')^2}{2\hbar^2}\right\}. \tag{7.36}$$

This is a very interesting result: the density matrix depends only on the difference of its 
arguments, dropping to zero fast as the distance between points x and x' exceeds the following 
characteristic scale (called the correlation length) 

Free particle’s correlation length 

$$x_c \equiv \frac{\hbar}{(mk_BT)^{1/2}}. \tag{7.37}$$


This length may be interpreted in the following way. It is straightforward to use Eq. (23) to verify that 
the average energy $E_p = p^2/2m$ of a particle in thermal equilibrium, i.e. in the classical mixture (33), 
equals $k_BT/2$ - this is just one more manifestation of the equipartition theorem. Hence the average 
momentum magnitude may be estimated as 

$$p_c \equiv \langle p^2\rangle^{1/2} = \left(2m\langle E\rangle\right)^{1/2} = (mk_BT)^{1/2}, \tag{7.38}$$

so that $x_c$ is of the order of the minimal length allowed by the Heisenberg-like “uncertainty relation”: 

$$x_c = \frac{\hbar}{p_c}. \tag{7.39}$$
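
The Gaussian integration leading from Eq. (34) to the un-normalized density matrix may be cross-checked numerically. The sketch below is only an illustration under stated assumptions: units with $\hbar = m = k_BT = 1$, a plain Riemann sum over a finite momentum window, and function names of my own choosing. It compares the momentum integral with the closed-form Gaussian of width $x_c = \hbar/(mk_BT)^{1/2}$.

```python
import numpy as np

hbar, m, kT = 1.0, 1.0, 1.0  # assumed illustrative units

def wZ_numeric(dx, p_max=12.0, n=4001):
    """Product w*Z from the momentum integral (7.34), as a Riemann sum; dx = x - x'."""
    p = np.linspace(-p_max, p_max, n)
    integrand = np.exp(-p**2 / (2 * m * kT)) * np.exp(1j * p * dx / hbar)
    return (integrand.sum() * (p[1] - p[0])).real / (2 * np.pi * hbar)

def wZ_closed(dx):
    """Closed-form Gaussian result of the same integration."""
    pref = np.sqrt(m * kT / (2 * np.pi * hbar**2))
    return pref * np.exp(-m * kT * dx**2 / (2 * hbar**2))

x_c = hbar / np.sqrt(m * kT)                 # correlation length
ratio_at_xc = wZ_closed(x_c) / wZ_closed(0.0)  # equals exp(-1/2)
```

For example, `wZ_numeric(0.5)` and `wZ_closed(0.5)` agree to many digits, and at the distance $x_c$ the function drops by the factor exp(-1/2) from its peak value.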


17 Due to the delta-normalization of the eigenfunction, the density matrix for the free particle (and any system 
with continuous eigenvalue spectrum) is normalized as 

$$\int_{-\infty}^{+\infty} w(x,x')Z\,dx' = \int_{-\infty}^{+\infty} w(x,x')Z\,dx = 1.$$

Notice that with the growth of temperature, the correlation length (37) goes to zero, and the 
density matrix (36) tends to the $\delta$-function: 

$$w(x,x')Z\,\big|_{T\to\infty} \to \delta(x-x'). \tag{7.40}$$

Since in this limit the average kinetic energy of the particle is larger than its potential energy in any 
fixed potential profile, Eq. (40) is the general property of the density matrix (33). 


Let us discuss the following curious feature of Eq. (36): if we replace $k_BT$ with $\hbar/i(t - t_0)$, and x' 
with $x_0$, the un-normalized density matrix wZ for a free particle turns into the particle’s propagator - see 
Eq. (2.49). This is not just a coincidence. Indeed, in Chapter 2 we saw that the propagator of 
a system with an arbitrary stationary Hamiltonian may be expressed via the stationary eigenfunctions as 

$$G(x,t;x_0,t_0) = \sum_n \psi_n(x)\exp\left\{-\frac{i}{\hbar}E_n(t-t_0)\right\}\psi^*_n(x_0). \tag{7.41}$$

Comparing this expression with Eq. (33), we see that the replacements 

$$\frac{i(t-t_0)}{\hbar} \to \frac{1}{k_BT}, \qquad x_0 \to x', \tag{7.42}$$

turn the pure-state propagator G into the un-normalized density matrix wZ of the same system in 
thermodynamic equilibrium. This important fact, rooted in the formal similarity of the Gibbs distribution 
(23) with the Schrödinger equation’s solution (1.67), enables a theoretical technique of the so-called 
thermodynamic Green’s functions, which is especially productive in condensed matter physics. 18 


For our purposes, we can use Eq. (42) to recycle some of the wave mechanics results, in particular 
the following formula for the harmonic oscillator’s propagator 

$$G(x,t;x_0,t_0) = \left(\frac{m\omega_0}{2\pi i\hbar\sin[\omega_0(t-t_0)]}\right)^{1/2}\exp\left\{-\frac{m\omega_0\left[(x^2+x_0^2)\cos[\omega_0(t-t_0)]-2xx_0\right]}{2i\hbar\sin[\omega_0(t-t_0)]}\right\}, \tag{7.43}$$


that may be readily proved to satisfy the Schrödinger equation for Hamiltonian (5.95), with the 
appropriate initial condition, $G(x, t_0; x_0, t_0) = \delta(x - x_0)$. Making substitution (42), we immediately get 

Harmonic oscillator in thermal equilibrium 

$$w(x,x')Z = \left(\frac{m\omega_0}{2\pi\hbar\sinh[\hbar\omega_0/k_BT]}\right)^{1/2}\exp\left\{-\frac{m\omega_0\left[(x^2+x'^2)\cosh[\hbar\omega_0/k_BT]-2xx'\right]}{2\hbar\sinh[\hbar\omega_0/k_BT]}\right\}. \tag{7.44}$$

As a sanity check, at very low temperatures, $k_BT \ll \hbar\omega_0$, both hyperbolic functions participating in this 
expression are very large and nearly equal, and Eq. (44) yields 

$$w(x,x')Z\,\big|_{T\to 0} \to \left[\left(\frac{m\omega_0}{\pi\hbar}\right)^{1/4}\exp\left\{-\frac{m\omega_0x^2}{2\hbar}\right\}\right]\times\exp\left\{-\frac{\hbar\omega_0}{2k_BT}\right\}\times\left[\left(\frac{m\omega_0}{\pi\hbar}\right)^{1/4}\exp\left\{-\frac{m\omega_0x'^2}{2\hbar}\right\}\right]. \tag{7.45}$$


18 I will have no time to discuss this technique, and have to refer the interested reader to special literature. 
Probably, the most famous text of that field is A. Abrikosov, L. Gor’kov, and I. Dzyaloshinski, Methods of 
Quantum Field Theory in Statistical Physics, Prentice-Hall, 1963. (Later reprintings are available from Dover.) 

In each of the square brackets we can readily recognize the ground state’s wavefunction (2.269), while 
the middle exponent is just the statistical sum (24) in the low-temperature limit when it is dominated by 
the ground-level contribution: 

$$Z\,\big|_{T\to 0} \to \exp\left\{-\frac{\hbar\omega_0}{2k_BT}\right\}. \tag{7.46}$$

As a result, Z in both parts of Eq. (45) may be cancelled, and the density matrix in this limit is described 
by Eq. (31), with the ground state as the only state of the system. This is natural when temperature is too 
low for the excitation of any other state. 

Returning to arbitrary temperatures, Eq. (44) in coinciding arguments gives the following 
expression for the probability density: 19 

$$w(x,x)Z \equiv w(x)Z = \left(\frac{m\omega_0}{2\pi\hbar\sinh[\hbar\omega_0/k_BT]}\right)^{1/2}\exp\left\{-\frac{m\omega_0x^2}{\hbar}\tanh\frac{\hbar\omega_0}{2k_BT}\right\}. \tag{7.47}$$

This is just a Gaussian function of x, with the following variance: 

$$\langle x^2\rangle = \frac{\hbar}{2m\omega_0}\coth\frac{\hbar\omega_0}{2k_BT}. \tag{7.48}$$


In order to compare this result with our earlier ones, it is useful to recast it as 

$$\langle U\rangle = \frac{m\omega_0^2}{2}\langle x^2\rangle = \frac{\hbar\omega_0}{4}\coth\frac{\hbar\omega_0}{2k_BT}. \tag{7.49}$$


Comparing this expression with Eq. (26), we see that the average value of potential energy is exactly 
one half of the total energy - the other half being the average kinetic energy. This is what we could 
expect, because according to Eqs. (5.129)-(5.130), such a relation holds for each Fock state and hence 
should also hold for their classical mixture. 
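
These results for the oscillator lend themselves to a simple numerical sanity check. In the sketch below (an illustration only), I assume units with $\hbar = m = \omega_0 = 1$, so that `theta` stands for $k_BT/\hbar\omega_0$; the un-normalized density matrix is coded in the form that follows from the oscillator’s propagator by substitution (42), and the grid, the temperature, and all names are arbitrary choices.

```python
import numpy as np

def wZ(x, xp, theta):
    """Un-normalized oscillator density matrix, cf. Eq. (7.44); hbar = m = omega_0 = 1."""
    s, c = np.sinh(1.0 / theta), np.cosh(1.0 / theta)
    pref = np.sqrt(1.0 / (2.0 * np.pi * s))
    return pref * np.exp(-((x**2 + xp**2) * c - 2.0 * x * xp) / (2.0 * s))

theta = 0.8                        # assumed temperature, in units of hbar*omega_0/k_B
x = np.linspace(-15.0, 15.0, 6001)
dx = x[1] - x[0]

diag = wZ(x, x, theta)             # coinciding arguments: w(x) * Z
Z_num = diag.sum() * dx            # since w(x) is normalized, this integral gives Z
x2_num = (x**2 * diag).sum() * dx / Z_num

Z_closed = np.exp(-0.5 / theta) / (1.0 - np.exp(-1.0 / theta))
x2_closed = 0.5 / np.tanh(0.5 / theta)   # variance, cf. Eq. (7.48)
U_avg = 0.5 * x2_num                      # <U> = m * omega_0^2 * <x^2> / 2
E_avg = 0.5 / np.tanh(0.5 / theta)        # total average energy, cf. Eq. (7.26)
```

The numerically integrated `Z_num` reproduces the statistical sum of the oscillator, the variance `x2_num` reproduces the coth law, and `U_avg` comes out as one half of `E_avg`.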

Unfortunately, besides the trivial case (30) of coinciding arguments, it is hard to give a 
straightforward interpretation of the density function in terms of system measurements. This is a 
fundamental difficulty that has been well explored in terms of the Wigner function (sometimes called the 
“Wigner-Ville distribution”) 20 defined as 

Wigner function: definition 

$$W(X,P) \equiv \frac{1}{2\pi\hbar}\int w\!\left(X+\frac{\tilde X}{2},\,X-\frac{\tilde X}{2}\right)\exp\left\{-\frac{iP\tilde X}{\hbar}\right\}d\tilde X. \tag{7.50}$$


19 I have to confess that this notation is imperfect, because from the point of view of rigorous mathematics, w(x, 
x ’) and w(x) are different functions, and so are w(p, p ’) and w(p) used below. In the perfect world, I would use 
different letters for them all, but I desperately want to stay with “w” for all the probability densities, and there are 
not so many good different fonts for this letter. Let me hope that the difference between these functions is clear 
from their arguments, and from the context. 

20 It was introduced in 1932 by E. Wigner on the basis of a general (Weyl-Wigner) transform suggested by H. 
Weyl in 1927, and re-derived in 1948 by J. Ville on a different mathematical basis. 

From the mathematical standpoint, this is just the Fourier transform of the density matrix in one of two 
new coordinates (Fig. 2) defined by relations 

$$x = X + \frac{\tilde X}{2}, \qquad x' = X - \frac{\tilde X}{2}. \tag{7.51}$$

Physically, the new argument $X = (x + x')/2$ may be understood as the average position of the 
particle during the time interval (t - t'), while $\tilde X = x - x'$ as the distance passed by the particle during 
that time interval, so that P may be interpreted as the characteristic momentum of the particle during that 
motion. As a result, the Wigner function is a construct intended to characterize the system’s spread 
simultaneously in the coordinate and momentum space - for 1D systems, on the phase plane [X, P] that 
we considered before - see Fig. 5.6. Let us see how fruitful these intentions are. 

Fig. 7.2. Coordinates $X$ and $\tilde X$ employed in the Weyl-Wigner transform (50), shown on the plane [x, x']. They differ 
from the coordinates obtained by the rotation of the reference frame by angle $\pi/4$ only by coefficients $\sqrt{2}$, 
describing scale stretching. 


First of all, we may write the Fourier transform reciprocal to Eq. (50): 

$$w\!\left(X+\frac{\tilde X}{2},\,X-\frac{\tilde X}{2}\right) = \int W(X,P)\exp\left\{+\frac{iP\tilde X}{\hbar}\right\}dP. \tag{7.52}$$

For the particular case $\tilde X = 0$, this relation yields 

$$w(X) \equiv w(X,X) = \int W(X,P)\,dP. \tag{7.53}$$

Hence the integral of the Wigner function over momentum P gives the probability density to find the 
system at point X. 

Actually, the function has the same property for integration over X. To prove that, we should 
first introduce the momentum representation of the density matrix, in full analogy with its coordinate 
representation (27): 

$$w(p,p') \equiv \langle p|\hat w|p'\rangle. \tag{7.54}$$

Inserting, as usual, two identity operators, in the form given by Eq. (5.21), into the right-hand part of this 
equality, we can get the following relation between the momentum and coordinate representations: 

$$w(p,p') = \langle p|\hat w|p'\rangle = \iint dx\,dx'\,\langle p|x\rangle\langle x|\hat w|x'\rangle\langle x'|p'\rangle = \frac{1}{2\pi\hbar}\iint dx\,dx'\,\exp\left\{-\frac{ipx}{\hbar}\right\}w(x,x')\exp\left\{+\frac{ip'x'}{\hbar}\right\}. \tag{7.55}$$

This is of course nothing else than the unitary transform of an operator from the x-basis to the p-basis, and is 
similar to the first form of Eq. (5.67). 21 For coinciding arguments, p = p', Eq. (55) is reduced to 

$$w(p) \equiv w(p,p) = \frac{1}{2\pi\hbar}\iint dx\,dx'\,w(x,x')\exp\left\{-\frac{ip(x-x')}{\hbar}\right\}. \tag{7.56}$$


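Using Eq. (29) and then Eq. (5.60), this function may be presented as 

$$w(p) = \frac{1}{2\pi\hbar}\sum_j W_j\iint dx\,dx'\,\psi_j(x)\,\psi^*_j(x')\exp\left\{-\frac{ip(x-x')}{\hbar}\right\} = \sum_j W_j\,\varphi_j(p)\,\varphi^*_j(p), \tag{7.57}$$

and hence interpreted as the probability density of the particle’s momentum at point p. Now, in variables 
(51), Eq. (56) has the form 

$$w(p) = \frac{1}{2\pi\hbar}\iint w\!\left(X+\frac{\tilde X}{2},\,X-\frac{\tilde X}{2}\right)\exp\left\{-\frac{ip\tilde X}{\hbar}\right\}dX\,d\tilde X. \tag{7.58}$$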


Comparing this equality with definition (50) of the Wigner function, we see that 

$$w(P) = \int W(X,P)\,dX. \tag{7.59}$$

Thus, according to Eqs. (53) and (59), the integral of the Wigner function over either the 
coordinate or the momentum gives the probability density to find the system at a certain value of the other variable. 
This is of course the main requirement for any candidate joint probability density, $\rho(X,P)$, to find a 
classical representation point of a stochastic system on the phase plane [X, P]. 22 
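
Both marginal properties may be verified numerically for a simple pure state. The following sketch is only an illustration under stated assumptions: it takes $\hbar = 1$, a Gaussian wave packet with arbitrarily chosen parameters `x0`, `p0`, `sigma`, Riemann sums on finite grids, and function names of my own; for a pure state, the density matrix in definition (50) is just the product $\psi(x)\psi^*(x')$.

```python
import numpy as np

hbar = 1.0  # assumed units

def psi(x, x0=0.5, p0=1.0, sigma=0.8):
    """Normalized Gaussian wave packet (an illustrative choice of the pure state)."""
    norm = (2.0 * np.pi * sigma**2) ** -0.25
    return norm * np.exp(-(x - x0)**2 / (4.0 * sigma**2) + 1j * p0 * x / hbar)

def wigner(X, P, y):
    """Definition (7.50) for a pure state; the integral over y is a Riemann sum."""
    dy = y[1] - y[0]
    prod = psi(X + y / 2) * np.conj(psi(X - y / 2))
    return (prod * np.exp(-1j * P * y / hbar)).sum().real * dy / (2.0 * np.pi * hbar)

y = np.linspace(-20.0, 20.0, 2001)       # grid for the difference coordinate
P_grid = np.linspace(-10.0, 12.0, 1101)  # momentum window centered on p0
X_grid = np.linspace(-9.5, 10.5, 1001)   # coordinate window centered on x0

X0, P0 = 0.9, 1.4                        # points at which the marginals are tested
marg_x = sum(wigner(X0, P, y) for P in P_grid) * (P_grid[1] - P_grid[0])
marg_p = sum(wigner(X, P0, y) for X in X_grid) * (X_grid[1] - X_grid[0])
```

Here `marg_x` reproduces $|\psi(X_0)|^2$, while `marg_p` reproduces the Gaussian momentum distribution of the packet, with the r.m.s. width $\hbar/2\sigma$.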

Let us see what the Wigner function looks like for the simplest systems in thermodynamic 
equilibrium. For a free 1D particle, we can use Eq. (34), ignoring for simplicity the normalization issues: 

$$W(X,P) \propto \int_{-\infty}^{+\infty}\exp\left\{-\frac{mk_BT\,\tilde X^2}{2\hbar^2}\right\}\exp\left\{-\frac{iP\tilde X}{\hbar}\right\}d\tilde X. \tag{7.60}$$

The usual Gaussian integration yields: 

$$W(X,P) = \text{const}\times\exp\left\{-\frac{P^2}{2mk_BT}\right\}. \tag{7.61}$$


We see that the function is independent of X (as it should be for this translational-invariant system), and 
coincides with the Gibbs distribution (23). We could get the same result directly from classical statistics. 
This is natural, because as we know from Sec. 2.2, the free motion is essentially not quantized - at least 
in terms of its energy and momentum. 


Now let us consider a substantially quantum system, the harmonic oscillator. Plugging Eq. (44) 
into Eq. (50), for that system in thermal equilibrium it is easy to show (and hence is left for the reader’s 
exercise) that the Wigner function is also Gaussian, but now in both its arguments: 

$$W(X,P) = \text{const}\times\exp\left\{-C\left[\frac{m\omega_0^2X^2}{2}+\frac{P^2}{2m}\right]\right\}, \tag{7.62}$$

though coefficient C is now different from $1/k_BT$, and tends to that limit only at high temperatures, $k_BT \gg \hbar\omega_0$. 
Moreover, for the Glauber state it also gives a very plausible result - a Gaussian distribution 
similar to Eq. (62), but shifted to the central point of the state - see Sec. 5.5. 23 

21 Note that the last line of Eq. (5.67) is invalid for the density operator $\hat w$, because it is not local! 

22 Such density, which would express the probability dW to find the system in a small area of the phase 
plane as $dW = \rho(X, P)\,dXdP$, is the basic notion of (1D) classical statistics - see, e.g., SM Sec. 2.1. 

Unfortunately, for some other possible states of the harmonic oscillator, e.g., any pure Fock state 
with n > 0, the Wigner function takes negative values in some regions of the [X, P] plane - Fig. 3. 24 





Fig. 7.3. The Wigner function of several Fock states of a 
harmonic oscillator: (a) n = 0, (b) n = 1; (c) n = 5. Adapted 
from http://en.wikipedia.org/wiki/Wigner function . 
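
The negativity is easy to see even without plotting. As a hedged sketch (assuming units with $\hbar = m = \omega_0 = 1$, the standard oscillator eigenfunctions for n = 0 and n = 1 only, a Riemann-sum integration, and my own function names), the code below evaluates definition (50) for these two pure Fock states at the origin of the phase plane.

```python
import numpy as np

def fock(n, x):
    """Oscillator eigenfunctions for n = 0 and n = 1 (units hbar = m = omega_0 = 1)."""
    g = np.pi ** -0.25 * np.exp(-x**2 / 2.0)
    if n == 0:
        return g
    if n == 1:
        return np.sqrt(2.0) * x * g
    raise ValueError("this sketch only covers n = 0 and n = 1")

def wigner_pure(n, X, P, y=np.linspace(-20.0, 20.0, 4001)):
    """Definition (7.50) for a pure (real-valued) state, via a Riemann sum over y."""
    dy = y[1] - y[0]
    integrand = fock(n, X + y / 2) * fock(n, X - y / 2) * np.exp(-1j * P * y)
    return integrand.sum().real * dy / (2.0 * np.pi)

W0 = wigner_pure(0, 0.0, 0.0)   # ground state: positive at the origin
W1 = wigner_pure(1, 0.0, 0.0)   # first excited state: negative at the origin
```

In these units, the values at the origin are $+1/\pi$ for n = 0 and $-1/\pi$ for n = 1, so that the first excited state already violates the positivity expected from a genuine probability density.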


The same is true for most other quantum systems. Indeed, this fact could be predicted just by 
looking at definition (50) applied to a pure quantum state, in which the density function may be factored 
- see Eq. (31): 

$$W(X,P) = \frac{1}{2\pi\hbar}\int\psi\!\left(X+\frac{\tilde X}{2}\right)\psi^*\!\left(X-\frac{\tilde X}{2}\right)\exp\left\{-\frac{iP\tilde X}{\hbar}\right\}d\tilde X. \tag{7.63}$$

Changing argument P (say, at fixed X), we are essentially changing the spatial “frequency” 
(wavenumber) of the wavefunction product’s Fourier component we are calculating, and we know that 
Fourier images typically change sign as the frequency is changed. Hence the wavefunctions should have 
some high-symmetry properties to avoid this effect. Indeed, the Gaussian functions (describing, for 
example, the Glauber states, and as the particular case, the ground state of the harmonic oscillator) have 
such a symmetry, but many other functions do not. 

Hence the Wigner function cannot be used in the role of classical probability density $\rho(X, P)$, 
otherwise we would get a negative probability for measurement in certain intervals dXdP - a notion 
hard to interpret. However, the Wigner function is still used for a semi-quantitative interpretation of 
states of open quantum systems. 

23 Please note that in notations of that section, arguments {X, P} of the Wigner function should be replaced with 
{x, p}, and capital letters saved for the Cartesian coordinates of the central point (5.133), i.e. the classical complex 
amplitude of the oscillations. 

24 Spectacular experimental measurements of this function (for n = 0 and n = 1) were carried out recently by E. 
Bimbard et al., Phys. Rev. Lett. 112, 033601 (2014). 



7.3. Open system dynamics: Dephasing 

So far we have discussed the density matrix as something given. Now let us discuss the 
evolution of the matrix in time, starting from the simplest case when the system is in state (15) with 
time-independent probabilities $W_j$. In the Schrödinger picture we can rewrite Eq. (15) as 

$$\hat w(t) = \sum_j |w_j(t)\rangle\,W_j\,\langle w_j(t)|. \tag{7.64}$$

Differentiating this equation by parts, and using Eqs. (4.157)-(4.158), with the account of the Hermitian 
nature of the Hamiltonian operator, we get 

$$\begin{aligned} i\hbar\dot{\hat w} &= i\hbar\sum_j\left[\,|\dot w_j(t)\rangle\,W_j\,\langle w_j(t)| + |w_j(t)\rangle\,W_j\,\langle\dot w_j(t)|\,\right] = \sum_j\left[\hat H|w_j(t)\rangle\,W_j\,\langle w_j(t)| - |w_j(t)\rangle\,W_j\,\langle w_j(t)|\hat H\right] \\ &= \hat H\sum_j |w_j(t)\rangle\,W_j\,\langle w_j(t)| - \sum_j |w_j(t)\rangle\,W_j\,\langle w_j(t)|\,\hat H. \end{aligned} \tag{7.65}$$


Now using Eq. (64) again (twice), we get the so-called von Neumann equation 25 

von Neumann equation 

$$i\hbar\dot{\hat w} = \left[\hat H, \hat w\right]. \tag{7.66}$$

This equation is similar in structure to Eq. (4.199) describing the time evolution of the Heisenberg- 
picture operators: 

$$i\hbar\dot{\hat A} = \left[\hat A, \hat H\right], \tag{7.67}$$

besides the operator order in the commutator, i.e., the sign of the right-hand part. This is quite natural, 
because Eq. (66) belongs to the Schrödinger picture, while Eq. (67) to the Heisenberg picture of the 
quantum dynamics. 
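
The von Neumann equation may be checked numerically for a small closed system. The sketch below is an illustration only: the 3×3 Hamiltonian and the initial density matrix are random (hypothetical) choices, $\hbar$ is set to 1, and the evolution is taken as $\hat w(t) = \hat U\hat w(0)\hat U^\dagger$ with $\hat U = \exp\{-i\hat Ht/\hbar\}$, which is what Eq. (64) gives if each ket evolves by the Schrödinger equation. The time derivative is estimated by a central finite difference.

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(1)

def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

n = 3
H = random_hermitian(n)  # hypothetical time-independent Hamiltonian

# A valid mixed state: positive semi-definite matrix with unit trace
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
w0 = a @ a.conj().T
w0 /= np.trace(w0).real

E, V = np.linalg.eigh(H)  # H = V diag(E) V^dagger

def w_at(t):
    """w(t) = U w(0) U^dagger, with U = exp(-i H t / hbar)."""
    U = V @ np.diag(np.exp(-1j * E * t / hbar)) @ V.conj().T
    return U @ w0 @ U.conj().T

t, dt = 0.4, 1e-5
lhs = 1j * hbar * (w_at(t + dt) - w_at(t - dt)) / (2 * dt)  # i*hbar*dw/dt
rhs = H @ w_at(t) - w_at(t) @ H                              # commutator [H, w]
```

The two sides agree to the accuracy of the finite-difference estimate, and the trace of `w_at(t)` stays equal to 1 at all times.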


25 In many texts, it is called the “Liouville equation”, due to the philosophical proximity to the classical Liouville 
theorem for the distribution function $\rho(X, P)$ or its multi-dimensional analog - see, e.g., SM Sec. 6.1, in particular 
Eq. (6.5). 

In the general case when a system, initially out of equilibrium, comes into contact with the 
environment, probabilities $W_j$ change, and dynamics is described by equations more complex than Eq. 
(66). However, we can still use this equation to discuss, using a simple model, the second (after the 
energy relaxation) major effect of the environment, dephasing (also called “decoherence”). 26 Let us 
consider the following model of a system interacting (weakly!) with the environment: 27 

$$\hat H = \hat H_s + \hat H_e\{\lambda\} + \hat H_{\rm int}. \tag{7.68}$$

Let us consider the simplest, two-level system, taking its Hamiltonian in the simplest form, 

$$\hat H_s = a_z\hat\sigma_z, \tag{7.69}$$


(as we know from Sec. 4.6, such a Hamiltonian is sufficient to avoid the energy level degeneracy), and a 
factorable (bilinear) interaction - cf. Eq. (6.148) and its discussion: 

$$\hat H_{\rm int} = -\hat f\{\lambda\}\,\hat\sigma_z. \tag{7.70}$$

Here $\hat f$ is a Hermitian operator depending only on the set $\{\lambda\}$ of environmental degrees of freedom 
(“coordinates”). These coordinates belong to the Hilbert space different from that of the two-level 
system, and hence operators $\hat f\{\lambda\}$ and $\hat H_e\{\lambda\}$ (that describes the environment) commute with $\hat\sigma_z$ - and 
any other intrinsic operator of the two-level system. Of course, any realistic $\hat H_e\{\lambda\}$ is very complex, so 
that it may be surprising how much we will be able to achieve without specifying it. 

Before we proceed to the solution, let me remind the reader of the important two-level systems that 
may be described by this model. The first example is an electron in an external magnetic field of a fixed 
direction (taken for axis z), which includes both an average component $\langle\mathcal{B}_z\rangle$ and a random (fluctuating) 
component $\tilde{\mathcal{B}}_z$. As it follows from the discussion in Chapter 4, it may be described by Hamiltonian (68)- 
(70) with 

$$a_z = \mu_B\langle\mathcal{B}_z\rangle, \qquad -\hat f = \mu_B\hat{\tilde{\mathcal{B}}}_z. \tag{7.71}$$

The second important example is a particle in a double-quantum-well potential (Fig. 4), with a 
barrier between the wells sufficiently high to be impenetrable, and an additional force F(t) exerted by the 
environment. If the force is sufficiently weak, we can neglect its effects on the shape of the quantum wells 
and hence on the localized wavefunctions $\psi_{L,R}$, so that the force effect is reduced to the variation of the 
difference $E_L - E_R = F(t)\Delta x$ between the well eigenenergies. As a result, it may be described by Eqs. (68)- 
(70) with 

$$a_z = \langle F\rangle\Delta x/2, \qquad -\hat f = \tilde F\Delta x/2. \tag{7.72}$$


26 Another example when $W_j$ are constant in time, and hence Eq. (66) is valid, is the thermodynamic equilibrium. 
However, in this case the statistical operator is diagonal in the stationary state basis and hence commutes with the 
Hamiltonian. Hence the right-hand part of Eq. (66) vanishes, and it shows that the density matrix does not evolve 
in time at all - as it should. 

27 Though this model works very well in many cases (see the examples given below), it is not adequate for a 
particle interacting with the environment of similar particles. In this case the methods discussed in the next 
chapter are more relevant. 

Fig. 7.4. Dephasing in a double quantum well system (potential profile $U_s(x) - F(t)x$). 


Returning to the general model (68)-(70), let us start its analysis from writing the usual equation 
of motion for the Heisenberg operator $\hat\sigma_z$: 28 

$$i\hbar\dot{\hat\sigma}_z = \left[\hat\sigma_z, \hat H\right] = (a_z - \hat f)\left[\hat\sigma_z, \hat\sigma_z\right] = 0, \tag{7.73}$$

so that the operator does not evolve in time. What does this mean for the observables? For an arbitrary 
density matrix of the two-level system, 

$$\mathrm{w} = \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22}\end{pmatrix}, \tag{7.74}$$


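we can readily calculate the trace of the operator $\hat\sigma_z\hat w$ (since operator traces are basis-independent, we can do this in any basis, in particular in the usual z-basis):

$$\mathrm{Tr}(\hat\sigma_z\hat w) = \mathrm{Tr}(\sigma_z\mathrm{w}) = \mathrm{Tr}\left[\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\right] = w_{11} - w_{22} = W_1 - W_2. \qquad (7.75)$$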
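As a quick numerical check of Eq. (75), the following minimal Python sketch multiplies the two matrices explicitly and takes the trace. The helper name `trace_sigma_z` and the sample matrix elements are illustrative assumptions, not part of the original derivation.

```python
def trace_sigma_z(w):
    """Return Tr(sigma_z w) for a 2x2 density matrix w given as nested lists."""
    sigma_z = [[1, 0], [0, -1]]
    # Explicit 2x2 matrix product sigma_z * w
    product = [
        [sum(sigma_z[i][k] * w[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]
    return product[0][0] + product[1][1]


# Hypothetical density matrix: W1 = 0.7, W2 = 0.3, with some coherence w12
w = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]
print(trace_sigma_z(w))  # should be close to W1 - W2
```

The off-diagonal elements drop out of the trace, which is why only the probability difference survives.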


Hence, according to Eq. (5), $\hat\sigma_z$ may be considered the operator for the observable $W_1 - W_2$, so that in the case (73), the difference $W_1 - W_2$ does not depend on time, and since the sum of the probabilities is also fixed, $W_1 + W_2 = 1$, both of them are constant. (The physics of this result is especially clear for the model shown in Fig. 4: since the potential barrier separating the quantum wells is so high that tunneling through it is negligible, the interaction with the environment cannot move the system from one well into the other. It may look like nothing interesting can happen in such a situation, but in a minute we will see that this is not true.) Hence, we may use the von Neumann equation (66) for the density matrix evolution (in the Schrodinger picture). In the usual z-basis:


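$$i\hbar\dot{\mathrm{w}} \equiv i\hbar\begin{pmatrix} \dot w_{11} & \dot w_{12} \\ \dot w_{21} & \dot w_{22} \end{pmatrix} = [\mathrm{H}, \mathrm{w}] = (a_z - \hat f)[\sigma_z, \mathrm{w}]$$

$$= (a_z - \hat f)\left[\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} - \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\right] = (a_z - \hat f)\begin{pmatrix} 0 & 2w_{12} \\ -2w_{21} & 0 \end{pmatrix}. \qquad (7.76)$$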


28 This can be done because we may consider the whole system, including the environment, as a Hamiltonian one 
- see Eq. (68). 




This means that though the diagonal elements, i.e., the probabilities of the states, do not evolve in time (as we already know), the off-diagonal coefficients do change; for example,

$$i\hbar\dot w_{12} = 2(a_z - \hat f)w_{12}, \qquad (7.77)$$

with a similar but complex-conjugate equation for $w_{21}$. The solution of the linear differential equation (77) is straightforward, and yields

$$w_{12}(t) = w_{12}(0)\exp\left\{-i\frac{2a_z}{\hbar}t\right\}\exp\left\{i\frac{2}{\hbar}\int_0^t\hat f(t')dt'\right\}. \qquad (7.78)$$


The first exponent is a deterministic c-number factor, while in the second one $\hat f(t) \equiv \hat f\{\lambda(t)\}$ is still an operator in the Hilbert space of the environment, and, from the point of view of the system of our interest, a random function of time.

Let us start from the limit when the environment behaves classically. 29 In this case, the operator in Eq. (78) may be considered as a classical random function of time f(t), provided that we average the result over the ensemble of many functions f(t) describing many (macroscopically similar) experiments. For a small time interval t = dt → 0, we can use the Taylor expansion of the exponent, truncating it after the quadratic term:


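$$\left\langle\exp\left\{i\frac{2}{\hbar}\int_0^{dt}f(t')dt'\right\}\right\rangle \approx \left\langle 1 + i\frac{2}{\hbar}\int_0^{dt}f(t')dt' + \frac{1}{2!}\left(i\frac{2}{\hbar}\int_0^{dt}f(t')dt'\right)\left(i\frac{2}{\hbar}\int_0^{dt}f(t'')dt''\right)\right\rangle$$

$$= 1 + i\frac{2}{\hbar}\int_0^{dt}\langle f(t')\rangle dt' - \frac{2}{\hbar^2}\int_0^{dt}dt'\int_0^{dt}dt''\langle f(t')f(t'')\rangle \equiv 1 - \frac{2}{\hbar^2}\int_0^{dt}dt'\int_0^{dt}dt''K_f(t' - t''). \qquad (7.79)$$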



Here we have used the fact that the first average is equal to zero (it is evident from Eqs. (69)-(70) that if f had any average component, it could be included into the parameter $a_z$), while the second average, called the correlation function, in a statistically (i.e. macroscopically) stationary state of the environment may depend only on the time difference $\tau \equiv t' - t''$:

$$\langle f(t')f(t'')\rangle = K_f(t' - t'') \equiv K_f(\tau). \qquad (7.80)$$

If this difference is much larger than some time scale $\tau_c$, called the correlation time of the random force, the values f(t') and f(t'') are completely independent (uncorrelated), as illustrated in Fig. 5a, so that the correlation function has to tend to zero. On the other hand, at τ = 0, i.e. t' = t'', the correlation function is just the variance of f,

$$K_f(0) = \langle f^2\rangle, \qquad (7.81)$$

and has to be positive. As a result, the function looks (qualitatively) like the sketch in Fig. 5b.


29 This assumption is not in any contradiction with the quantum treatment of the two-level system, because a typical environment has a very dense energy spectrum, so that the distances between its levels may be readily bridged by thermal excitations of energies $\sim k_{\rm B}T \ll 2a_z$, often making it essentially classical.





Fig. 7.5. (a) Typical random 
process and (b) its correlation 
function - schematically. 


Hence, if we are only interested in time differences τ much longer than $\tau_c$, we may approximate $K_f(\tau)$ with a delta-function. Let us take it in the following convenient form

$$K_f(\tau) \approx \hbar^2 D_\varphi\delta(\tau), \qquad (7.82)$$

where $D_\varphi$ is a positive constant called the phase diffusion coefficient. The origin of this term stems from the very similar effect of diffusion of atoms or small solid particles in real space - the so-called Brownian motion. 30 Indeed, if a small classical particle moves in a highly viscous medium, its velocity is approximately proportional to the external force. Hence, if the random hits of a 1D particle by the molecules may be described by a force that obeys a law similar to Eq. (82), the velocity (along any Cartesian coordinate) is also delta-correlated:

$$\langle v(t)\rangle = 0, \qquad \langle v(t')v(t'')\rangle = 2D\delta(t' - t''). \qquad (7.83)$$


Now we can integrate the kinematic equation $\dot x = v$ to calculate the particle's deviation from the initial position,

$$x(t) - x(0) = \int_0^t v(t')dt', \qquad (7.84)$$

and its variance:

$$\left\langle(x(t) - x(0))^2\right\rangle = \left\langle\int_0^t v(t')dt'\int_0^t v(t'')dt''\right\rangle = \int_0^t dt'\int_0^t dt''\langle v(t')v(t'')\rangle = \int_0^t dt'\int_0^t dt''\,2D\delta(t' - t'') = 2Dt. \qquad (7.85)$$

This is the famous law of diffusion, showing that the r.m.s. deviation of the particle from the initial point grows with time as $(2Dt)^{1/2}$, where the constant D is called the diffusion coefficient.
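The diffusion law (85) is easy to probe with a small Monte Carlo experiment. The sketch below is only an illustration under a stated assumption: the delta-correlated velocity (83) is replaced by its usual discrete analog, independent Gaussian displacements with variance 2D dt per time step. The function name `mean_square_displacement` and all parameter values are hypothetical.

```python
import math
import random


def mean_square_displacement(D, t, steps=100, walkers=2000, seed=1):
    """Estimate <(x(t) - x(0))^2> for a 1D random walk with diffusion coefficient D."""
    rng = random.Random(seed)
    dt = t / steps
    # Discrete analog of Eq. (7.83): each step has variance 2*D*dt
    sigma = math.sqrt(2.0 * D * dt)
    total = 0.0
    for _ in range(walkers):
        x = 0.0
        for _ in range(steps):
            x += rng.gauss(0.0, sigma)
        total += x * x
    return total / walkers


D, t = 0.5, 2.0
print("simulated:", mean_square_displacement(D, t), "  law 2Dt:", 2.0 * D * t)
```

With a few thousand walkers the estimate typically agrees with 2Dt to within a few percent; the residual scatter is purely statistical.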

Returning to the diffusion of the quantum-mechanical phase, with Eq. (82) the last double integral in Eq. (79) yields $\hbar^2 D_\varphi dt$, so that

$$\langle w_{12}(dt)\rangle = w_{12}(0)\exp\left\{-i\frac{2a_z}{\hbar}dt\right\}(1 - 2D_\varphi dt). \qquad (7.86)$$

Applying this formula to sequential time intervals,


30 The theory of this phenomenon, first observed experimentally by biologist R. Brown in the early 1800s, was 
pioneered by A. Einstein in 1905 (see in particular Eq. (206) below) and developed in detail by M. Smoluchowski 
in 1906-1907, and A. Fokker in 1913. 



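$$\langle w_{12}(2dt)\rangle = \langle w_{12}(dt)\rangle\exp\left\{-i\frac{2a_z}{\hbar}dt\right\}(1 - 2D_\varphi dt) = w_{12}(0)\exp\left\{-i\frac{2a_z}{\hbar}2dt\right\}(1 - 2D_\varphi dt)^2, \qquad (7.87)$$

etc., for a finite time t = N dt, in the limit N → ∞ and dt → 0 (at fixed t), we get 31

$$\langle w_{12}(t)\rangle = w_{12}(0)\exp\left\{-i\frac{2a_z}{\hbar}t\right\}\lim_{N\to\infty}\left(1 - 2D_\varphi\frac{t}{N}\right)^N. \qquad (7.88a)$$

By the definition of the natural logarithm base e, 32 this limit is just $\exp\{-2D_\varphi t\}$, so that, finally:

$$\langle w_{12}(t)\rangle = w_{12}(0)\exp\left\{-i\frac{2a_z}{\hbar}t\right\}\exp\{-2D_\varphi t\} \equiv w_{12}(0)\exp\left\{-i\frac{2a_z}{\hbar}t\right\}\exp\left\{-\frac{t}{T_2}\right\}. \qquad (7.88b)$$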
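The decay factor in Eq. (88b) may be checked numerically in two independent ways. The following Python sketch is a minimal illustration, assuming a classical Gaussian, delta-correlated force as in Eq. (82): the random phase accumulated per step dt then has the variance 4 D_φ dt, and the ensemble average of exp{iφ} should approach exp{-2 D_φ t}. The function names and parameter values are hypothetical.

```python
import cmath
import math
import random


def coherence(D_phi, t, steps=50, samples=5000, seed=2):
    """Monte Carlo estimate of |<exp(i*phi)>| for a diffusing phase."""
    rng = random.Random(seed)
    dt = t / steps
    # Phase increment (2/hbar) * integral of f over dt; by Eq. (7.82) its variance is 4*D_phi*dt
    sigma = math.sqrt(4.0 * D_phi * dt)
    acc = 0j
    for _ in range(samples):
        phi = 0.0
        for _ in range(steps):
            phi += rng.gauss(0.0, sigma)
        acc += cmath.exp(1j * phi)
    return abs(acc / samples)


def limit_factor(D_phi, t, N):
    """The finite-N product (1 - 2*D_phi*t/N)**N from Eq. (7.88a)."""
    return (1.0 - 2.0 * D_phi * t / N) ** N


D_phi, t = 0.25, 1.0
print("Monte Carlo :", coherence(D_phi, t))
print("finite N    :", limit_factor(D_phi, t, 10**6))
print("exp(-2 D t) :", math.exp(-2.0 * D_phi * t))
```

Both numbers should approach the same exponential; the Monte Carlo value carries a statistical error of the order of one percent for these sample sizes.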


So, due to the coupling to the environment, the off-diagonal elements of the density matrix decay with the characteristic dephasing time $T_2 = 1/2D_\varphi$, providing a natural evolution from the density matrix (22) of a pure state to the diagonal matrix

$$\mathrm{w} = \begin{pmatrix} W_1 & 0 \\ 0 & W_2 \end{pmatrix}, \qquad (7.89)$$

with the same probabilities $W_{1,2}$, describing a fully dephased (incoherent) classical mixture.

Our simple model offers a very clear look at the nature of decoherence: the “force” f(t), exerted by the environment, “shakes” the energy difference between the two eigenstates of the system and hence the instantaneous velocity $2(a_z - f)/\hbar$ of their mutual phase shift φ(t) - cf. Eq. (24). Due to the randomness of the force, φ(t) performs a random walk around the trigonometric circle, so that eventually the averaging of its trigonometric functions $\exp\{\pm i\varphi\}$ over the possible states of the environment yields zero, killing the off-diagonal elements of the density matrix. Our analysis, however, has left open two important issues:

(i) Is this approach valid for a quantum description of a typical environment?

(ii) If yes, what is $D_\varphi$?


7.4. Fluctuation-dissipation theorem

Similar questions may be asked about a more general situation, when the Hamiltonian $\hat H_s$ of the system of interest (s), in the composite Hamiltonian (68), is not specified at all, but the interaction between that system and its environment still has the bilinear form similar to Eqs. (70) and (6.130):

$$\hat H_{\rm int} = -\hat F\{\lambda\}\hat x, \qquad (7.90)$$


31 This result is valid only if the approximation (82) may be applied at the time interval dt which, in turn, should be much smaller than $T_2$, i.e. if the dephasing time is much longer than the environment's correlation time $\tau_c$. This requirement is usually well satisfied, because in most environments $\tau_c$ is very short. For example, in the original Brownian motion experiments with few-μm ink particles in water, it is of the order of the average interval between sequential molecular impacts, of the order of $10^{-21}$ s.

32 See, e.g., MA Eq. (1.2a).
where x is some observable of the subsystem s (say, a generalized coordinate or a generalized momentum). It may look incredible that in this very general situation one may make a very simple and powerful statement about the statistical properties of the generalized external force F, under only two (interrelated) conditions - which are satisfied in a huge number of cases of interest:

(i) the coupling of the system s of interest to the environment e is weak - in the sense of the perturbation theory (see Chapter 6), and

(ii) the environment may be considered as staying in thermodynamic equilibrium, with a certain temperature T, regardless of the process in the system of interest. 33

This famous statement is called the fluctuation-dissipation theorem (FDT). 34 Due to the importance of this fundamental result, let me derive it. 35


Since by writing Eq. (68) we treat the whole system (s + e) as a Hamiltonian one, 36 we may use the Heisenberg equation (4.199) to write

$$i\hbar\dot{\hat F} = [\hat F, \hat H] = [\hat F, \hat H_e], \qquad (7.91)$$

because, as was discussed in the last section, the operator $\hat F\{\lambda\}$ commutes with the operators $\hat H_s$ and $\hat x$.

Generally, very little may be done with this equation, because the time evolution of the environment's Hamiltonian depends, in turn, on that of the force. This is where the perturbation theory becomes indispensable. Let us decompose the external force's operator into the following sum:

$$\hat F\{\lambda\} = \langle\hat F\rangle + \hat{\tilde F}(t), \quad \text{with } \langle\hat{\tilde F}(t)\rangle = 0, \qquad (7.92)$$

where (until further notice) the sign ⟨...⟩ means the statistical averaging over the environment alone. 37 From the point of view of system s, the first term of the sum (still an operator!) describes the average response


33 The most frequent example of violation of these conditions is environment’s overheating by the energy flow 
from the subsystem. I leave it to the reader to estimate the overheating of a standard physical laboratory room by a 
typical dissipative quantum process - the emission of an optical photon by an atom. (Hint: extremely small.) 

34 The FDT was first derived by H. Callen and T. Welton in 1951, on the background of an earlier derivation of 
its classical limit by H. Nyquist in 1928, and the pioneering 1905 work by A. Einstein - see below. 

35 The FDT may be proved in several ways which are different from, and shorter than the one given in this section 
- see, e.g., either SM Secs. 5.5 and 5.6 (based on H. Nyquist’s arguments), or the original paper by H. Callen and 
T. Welton, Phys. Rev. 83, 34 (1951) - wonderful in its clarity. The longer approach I describe here, besides giving 
the important Kubo formula (109) as a byproduct, is a very useful exercise in the operator manipulation and the 
perturbation theory in its integral form, different from the differential form used in Chapter 6. If the reader is not 
interested in this exercise, he or she may skip the derivation and jump directly to the result expressed by Eq. 
(134), which uses the notions defined by Eqs. (114) and (123). 

36 We can always do that if the local environment is large enough, so that the processes in our subsystem would 
not depend on the type of boundary between it and the external environment; in particular we may assume the 
total system closed, i.e. Hamiltonian. 

37 For usual (“ergodic”) environments, without intrinsic long-term memories, this statistical averaging over an ensemble of environments is equivalent to averaging over relatively short times - much longer than the correlation time $\tau_c$ of the environment, but still much shorter than the characteristic time of evolution of the system under analysis, such as the dephasing time $T_2$ and the energy relaxation time $T_1$ - both still to be calculated. As was already mentioned, in most practical environments $\tau_c$ is very short. Thus, for relatively “massive” (inertial) systems of interest, the separation of the averaging into two steps is well justified.
of the environment to the system dynamics (possibly, including such irreversible effects as friction/viscosity), and has to be calculated with an account of their interaction - as we will do later in this section. On the other hand, the second term in Eq. (92) represents fluctuations of the environment, which exist even in the absence of system s. Hence, in the first nonvanishing approximation in the interaction strength, the fluctuation part may be calculated ignoring the interaction, i.e. treating the environment as being in thermodynamic equilibrium: 38

$$i\hbar\dot{\hat{\tilde F}} = \left[\hat{\tilde F}, \hat H_e|_{\rm eq}\right]. \qquad (7.93)$$


Since in this approximation the environment's Hamiltonian does not have an explicit dependence on time, the solution of this equation may be written by combining Eqs. (4.175) and (4.190):

$$\hat{\tilde F}(t) = \exp\left\{+\frac{i}{\hbar}\hat H_e|_{\rm eq}t\right\}\hat{\tilde F}(0)\exp\left\{-\frac{i}{\hbar}\hat H_e|_{\rm eq}t\right\}. \qquad (7.94)$$


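Let us use this relation to calculate the correlation function of fluctuations, defined similarly to Eq. (80), but paying close attention to the order of the time arguments (very soon we will see why):

$$\left\langle\tilde F(t)\tilde F(t')\right\rangle = \left\langle\exp\left\{+\frac{i}{\hbar}\hat H_e t\right\}\hat{\tilde F}(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t\right\}\exp\left\{+\frac{i}{\hbar}\hat H_e t'\right\}\hat{\tilde F}(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t'\right\}\right\rangle, \qquad (7.95)$$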


where the thermal equilibrium of the environment is implied. We are free to calculate this expectation value in any basis, and the best choice is evident, because in the environment's stationary-state basis, its Hamiltonian, the exponents in Eq. (95), and the density operator of the environment are all represented by diagonal matrices. Using Eq. (5), the correlation function becomes

$$\left\langle\tilde F(t)\tilde F(t')\right\rangle = \mathrm{Tr}\left[\hat w\exp\left\{+\frac{i}{\hbar}\hat H_e t\right\}\hat{\tilde F}(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t\right\}\exp\left\{+\frac{i}{\hbar}\hat H_e t'\right\}\hat{\tilde F}(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t'\right\}\right]$$

$$= \sum_{n,n'}W_n\exp\left\{+\frac{i}{\hbar}E_n t\right\}F_{nn'}\exp\left\{-\frac{i}{\hbar}E_{n'}t\right\}\exp\left\{+\frac{i}{\hbar}E_{n'}t'\right\}F_{n'n}\exp\left\{-\frac{i}{\hbar}E_n t'\right\}$$

$$= \sum_{n,n'}W_n|F_{nn'}|^2\exp\left\{i\frac{(E_n - E_{n'})(t - t')}{\hbar}\right\}. \qquad (7.96)$$

Here $W_n$ are the Gibbs distribution probabilities, given by Eq. (23) with the environment's temperature T, and $F_{nn'}$ are the Schrodinger-picture matrix elements of the interaction force operator.

We see that the correlator (96) is a function of the difference $\tau \equiv t - t'$ only (as it should be for fluctuations in a macroscopically stationary system), but may depend on the order of the operands. This is why let us denote this particular correlation function by the upper index “+”,


38 Here we assume that for the equilibrium, Eq. (92) has zero average, because if this is not so, this average part of 
force may be always included into the Hamiltonian of subsystem s. 
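$$K_F^+(\tau) \equiv \left\langle\tilde F(t)\tilde F(t')\right\rangle = \sum_{n,n'}W_n|F_{nn'}|^2\exp\left\{i\frac{\tilde E\tau}{\hbar}\right\}, \quad \text{where } \tilde E \equiv E_n - E_{n'}, \qquad (7.97)$$

and its counterpart by the upper index “-”:

$$K_F^-(\tau) \equiv K_F^+(-\tau) = \left\langle\tilde F(t')\tilde F(t)\right\rangle = \sum_{n,n'}W_n|F_{nn'}|^2\exp\left\{-i\frac{\tilde E\tau}{\hbar}\right\}. \qquad (7.98)$$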


So, in contrast with classical processes, in quantum mechanics the correlation function of fluctuations $\tilde F$ is not necessarily time-symmetric:

$$K_F^+(\tau) - K_F^-(\tau) \equiv K_F^+(\tau) - K_F^+(-\tau) = \left\langle\tilde F(t)\tilde F(t') - \tilde F(t')\tilde F(t)\right\rangle = 2i\sum_{n,n'}W_n|F_{nn'}|^2\sin\frac{\tilde E\tau}{\hbar} \neq 0, \qquad (7.99)$$

so that $\hat{\tilde F}(t)$ gives a good example of a Heisenberg-picture operator whose “values”, taken at different moments of time, generally do not commute - the opportunity already mentioned in Sec. 4.6. 39
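To see this time asymmetry at work, here is a minimal Python sketch for a toy two-level "environment". It assumes units with ħ = 1, the convention $K_F^+(\tau) = \sum W_n|F_{nn'}|^2\exp\{i(E_n - E_{n'})\tau/\hbar\}$ that follows from Eq. (96), and hypothetical energies and matrix elements; the function names are illustrative only.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1


def gibbs(energies, kT):
    """Gibbs probabilities W_n for the given energy levels."""
    weights = [math.exp(-E / kT) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]


def k_plus(tau, energies, F, kT):
    """K_F^+(tau) = sum_{n,n'} W_n |F_nn'|^2 exp{i (E_n - E_n') tau / hbar}."""
    W = gibbs(energies, kT)
    total = 0j
    for n, En in enumerate(energies):
        for m, Em in enumerate(energies):
            total += W[n] * abs(F[n][m]) ** 2 * cmath.exp(1j * (En - Em) * tau / HBAR)
    return total


def commutator_average(tau, energies, F, kT):
    """K_F^+(tau) - K_F^-(tau), using K_F^-(tau) = K_F^+(-tau)."""
    return k_plus(tau, energies, F, kT) - k_plus(-tau, energies, F, kT)


energies = [0.0, 1.0]            # hypothetical two-level spectrum
F = [[0.0, 0.5], [0.5, 0.0]]     # hypothetical Hermitian force matrix
print(commutator_average(0.7, energies, F, kT=1.0))  # purely imaginary, nonzero
print(commutator_average(0.0, energies, F, kT=1.0))  # vanishes at tau = 0
```

The result is purely imaginary and vanishes at τ = 0, in line with the sanity check mentioned in the footnote to Eq. (99).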

Now let us return to the force decomposition (92), and calculate the first (average) component of the force. In order to do that, let us write the formal solution of Eq. (91) as follows:

$$\hat F(t) = \frac{1}{i\hbar}\int_{-\infty}^{t}\left[\hat F(t'), \hat H_e(t')\right]dt'. \qquad (7.100)$$


In the right-hand part of this relation, we cannot treat the Hamiltonian of the environment as an unperturbed (equilibrium) one, because the result would have zero statistical average. Hence, we should make one more step in our perturbative treatment, and take into account (in the first nonvanishing approximation) the effect of our system of interest (s) on the environment. To do this, let us write the (so far, exact) Heisenberg equation of motion for the environment's Hamiltonian,

$$i\hbar\dot{\hat H}_e = \left[\hat H_e, \hat H\right] = -\hat x\left[\hat H_e, \hat F\right], \qquad (7.101)$$

and its formal solution, similar to Eq. (100), but for an arbitrary time t' rather than t:

$$\hat H_e(t') = -\frac{1}{i\hbar}\int_{-\infty}^{t'}\hat x(t'')\left[\hat H_e(t''), \hat F(t'')\right]dt''. \qquad (7.102)$$


Plugging this equality into the right-hand part of Eq. (100), and averaging the result (again, over the environment only!), we get

$$\left\langle\hat F(t)\right\rangle = \frac{1}{\hbar^2}\int_{-\infty}^{t}dt'\int_{-\infty}^{t'}dt''\,\hat x(t'')\left\langle\left[\hat F(t'), \left[\hat H_e(t''), \hat F(t'')\right]\right]\right\rangle. \qquad (7.103)$$

As we will see shortly, this expression gives a nonvanishing result even if the right-hand-part averaging is carried over the unperturbed (thermal-equilibrium) environment, so that unless we are interested in higher-order corrections, there is no need to refine the result any further. This fact enables us to calculate the average in the right-hand part of Eq. (103) absolutely similarly to that in Eq. (96), using Eq. (94):


39 A good sanity check here is that at τ = 0, the difference (99) between $K_F^+(\tau)$ and $K_F^-(\tau)$ vanishes.
$$\left\langle\left[\hat F(t'), \left[\hat H_e, \hat F(t'')\right]\right]\right\rangle = \mathrm{Tr}\left\{\hat w\left[\hat F(t'), \left[\hat H_e, \hat F(t'')\right]\right]\right\}$$

$$= \mathrm{Tr}\left\{\hat w\left[\hat F(t')\hat H_e\hat F(t'') - \hat F(t')\hat F(t'')\hat H_e - \hat H_e\hat F(t'')\hat F(t') + \hat F(t'')\hat H_e\hat F(t')\right]\right\}$$

$$= \sum_{n,n'}W_n\left[F_{nn'}(t')E_{n'}F_{n'n}(t'') - F_{nn'}(t')F_{n'n}(t'')E_n - E_nF_{nn'}(t'')F_{n'n}(t') + F_{nn'}(t'')E_{n'}F_{n'n}(t')\right]$$

$$= -\sum_{n,n'}W_n\tilde E|F_{nn'}|^2\left[\exp\left\{i\frac{\tilde E(t' - t'')}{\hbar}\right\} + \text{c.c.}\right]. \qquad (7.104)$$

Now, if we try to integrate each term of this sum, as Eq. (103) seems to require, we will see that the lower-limit substitution (at t', t'' → -∞) is uncertain, because the exponents oscillate without decay. This technical difficulty may be overcome by the following reasoning. As illustrated by the example considered in the previous section, coupling to a disordered environment makes the “memory horizon” of the subsystem of our interest (s) finite: its current state does not depend on its history beyond a certain time scale - in that example, the dephasing time $T_2$. (Actually, this is true for virtually all real physical systems, in contrast to idealized models such as a dissipation-free pendulum that swings forever with the same amplitude.) As a result, the functions under the integrals of Eq. (103), i.e. the sum (104), should self-average at a certain finite time. One simple technique for expressing this fact mathematically is just dropping the lower-limit substitution; this would give the correct result for Eq. (103). However, a better (mathematically more acceptable) trick is to first multiply the function under each integral by, respectively, $\exp\{\varepsilon(t' - t)\}$ and $\exp\{\varepsilon(t'' - t')\}$, where ε is a very small positive constant, then carry out the integration, and after that take the limit ε → 0. The physical justification of this procedure may be provided by saying that the system's behavior should not be affected if its interaction with the environment was not kept constant but was turned on gradually - say, exponentially with an infinitesimal rate ε. With this modification, Eq. (103) becomes

$$\left\langle\hat F(t)\right\rangle = -\frac{1}{\hbar^2}\sum_{n,n'}W_n\tilde E|F_{nn'}|^2\lim_{\varepsilon\to0}\int_{-\infty}^{t}dt'\int_{-\infty}^{t'}dt''\,\hat x(t'')\left[\exp\left\{i\frac{\tilde E(t' - t'')}{\hbar} + \varepsilon(t'' - t)\right\} + \text{c.c.}\right]. \qquad (7.105)$$

This double integration is over the area shaded in Fig. 6, so that the order of integration may be changed to the opposite one as

$$\int_{-\infty}^{t}dt'\int_{-\infty}^{t'}dt''... = \int_{-\infty}^{t}dt''\int_{t''}^{t}dt'... = \int_{-\infty}^{t}dt''\int_{0}^{\tau}d\tau'..., \qquad (7.106)$$

where $\tau' \equiv t - t'$, and $\tau \equiv t - t''$.

Fig. 7.6. 2D integration area in Eqs. (105) and (106).
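As a result, Eq. (105) may be rewritten as a single integral,

$$\left\langle\hat F(t)\right\rangle = \int_{-\infty}^{t}G(t - t'')\hat x(t'')dt'' \equiv \int_0^{\infty}G(\tau)\hat x(t - \tau)d\tau, \qquad (7.107)$$

whose kernel,

$$G(\tau > 0) \equiv -\frac{1}{\hbar^2}\sum_{n,n'}W_n\tilde E|F_{nn'}|^2\lim_{\varepsilon\to0}\int_0^{\tau}\left[\exp\left\{i\frac{\tilde E(\tau - \tau')}{\hbar} - \varepsilon\tau\right\} + \text{c.c.}\right]d\tau'$$

$$= -\lim_{\varepsilon\to0}\frac{2}{\hbar}\sum_{n,n'}W_n|F_{nn'}|^2\sin\frac{\tilde E\tau}{\hbar}\,e^{-\varepsilon\tau} = -\frac{2}{\hbar}\sum_{n,n'}W_n|F_{nn'}|^2\sin\frac{\tilde E\tau}{\hbar}, \qquad (7.108)$$

does not depend on the particular law of evolution of the subsystem (s) under study, i.e. provides a general characterization of its coupling to the environment.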


In Eq. (107) we may readily recognize the most general form of the linear response of a system (in our case, the environment), taking into account the causality principle, where G(τ) is the response function (also called the “temporal Green's function”) of the environment. 40 Comparing Eq. (108) with Eq. (99), we get a wonderfully simple universal relation, 41

$$\left\langle\left[\hat{\tilde F}(\tau), \hat{\tilde F}(0)\right]\right\rangle = -i\hbar G(\tau), \qquad (7.109)$$


that emphasizes once again the quantum nature of the correlation function's time asymmetry. However, the relation between G(τ) and the force anti-commutator,

$$\left\langle\left\{\hat{\tilde F}(t + \tau), \hat{\tilde F}(t)\right\}\right\rangle \equiv \left\langle\tilde F(t + \tau)\tilde F(t) + \tilde F(t)\tilde F(t + \tau)\right\rangle \equiv K_F^+(\tau) + K_F^-(\tau), \qquad (7.110)$$

is much more important, for the following reason. Relations (97)-(98) show that the so-called symmetrized correlation function,

$$K_F(\tau) \equiv \frac{K_F^+(\tau) + K_F^-(\tau)}{2} = \frac{1}{2}\left\langle\left\{\hat{\tilde F}(\tau), \hat{\tilde F}(0)\right\}\right\rangle = \lim_{\varepsilon\to0}\sum_{n,n'}W_n|F_{nn'}|^2\cos\frac{\tilde E\tau}{\hbar}\,e^{-\varepsilon|\tau|} \equiv \sum_{n,n'}W_n|F_{nn'}|^2\cos\frac{\tilde E\tau}{\hbar}, \qquad (7.111)$$

that is evidently an even function of the time difference τ, looks very similar to the response function (108), “only” with another trigonometric function under the sum. This similarity may be used to obtain an exact algebraic relation between the Fourier images of these two functions of τ. Indeed, function (111) may be represented as the Fourier transform 42


40 For a more detailed discussion of this function and the causality principle, see, e.g., CM Sec. 4.1. 

41 This relation, called the Kubo (or “Green-Kubo”) formula, after the works by M. Green (1954) and R. Kubo (1957), does not come up in the easier derivations of the FDT, discussed in the beginning of this section.

42 Due to their practical importance, and certain mathematical issues with their justification for random functions, Eqs. (112)-(113) have their own grand name, the Wiener-Khinchin theorem, though, mathematical rigor aside, they are just a straightforward corollary of the Fourier integral transform (115) - see, e.g., SM Sec. 5.4.




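$$K_F(\tau) = \int_{-\infty}^{+\infty}S_F(\omega)e^{-i\omega\tau}d\omega = 2\int_0^{+\infty}S_F(\omega)\cos\omega\tau\,d\omega, \qquad (7.112)$$

with the reciprocal transform

$$S_F(\omega) = \frac{1}{2\pi}\int_{-\infty}^{+\infty}K_F(\tau)e^{i\omega\tau}d\tau = \frac{1}{\pi}\int_0^{+\infty}K_F(\tau)\cos\omega\tau\,d\tau, \qquad (7.113)$$

of the symmetrized spectral density of variable F, defined as

$$S_F(\omega)\delta(\omega - \omega') \equiv \frac{1}{2}\left\langle\hat F_\omega\hat F_{-\omega'} + \hat F_{-\omega'}\hat F_\omega\right\rangle \equiv \frac{1}{2}\left\langle\left\{\hat F_\omega, \hat F_{-\omega'}\right\}\right\rangle, \qquad (7.114)$$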


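where $\hat F_\omega$ (also an operator rather than a c-number!) is defined as

$$\hat F_\omega \equiv \frac{1}{2\pi}\int_{-\infty}^{+\infty}\hat{\tilde F}(t)e^{i\omega t}dt, \quad \text{so that } \hat{\tilde F}(t) = \int_{-\infty}^{+\infty}\hat F_\omega e^{-i\omega t}d\omega. \qquad (7.115)$$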


The physical meaning of the function $S_F(\omega)$ becomes evident if we write Eq. (112) for the particular case τ = 0:

$$K_F(0) \equiv \left\langle\tilde F^2\right\rangle = \int_{-\infty}^{+\infty}S_F(\omega)d\omega = 2\int_0^{+\infty}S_F(\omega)d\omega. \qquad (7.116)$$

This formula implies that if we pass the function F(t) through a linear filter cutting from its frequency spectrum a narrow band dω of real (positive) frequencies, then the variance $\langle F_f^2\rangle$ of the filtered signal $F_f(t)$ would be equal to $2S_F(\omega)d\omega$ - hence the name “spectral density”. 43
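The transform pair (112)-(113) can be tried out on a simple model. The Python sketch below assumes, purely for illustration, an exponentially decaying correlation function $K_F(\tau) = K_0\exp\{-|\tau|/\tau_c\}$ (this form is not derived in the text); for it, Eq. (113) gives the Lorentzian $S_F(\omega) = (K_0/\pi)\,\tau_c/(1 + \omega^2\tau_c^2)$. The code evaluates the cosine integral in Eq. (113) by the trapezoid rule and compares it with that closed form. All names and numbers are hypothetical.

```python
import math


def spectral_density(omega, K0, tau_c, t_max=60.0, n=60000):
    """S_F(omega) = (1/pi) * int_0^inf K_F(tau) cos(omega*tau) dtau, by the trapezoid rule.

    Assumes K_F(tau) = K0 * exp(-tau/tau_c); t_max must be much larger than tau_c.
    """
    h = t_max / n
    total = 0.5 * K0  # half-weight of the tau = 0 endpoint; the upper endpoint is negligible
    for k in range(1, n):
        tau = k * h
        total += K0 * math.exp(-tau / tau_c) * math.cos(omega * tau)
    return total * h / math.pi


def lorentzian(omega, K0, tau_c):
    """Closed-form spectral density for the exponential correlation function."""
    return K0 * tau_c / (math.pi * (1.0 + (omega * tau_c) ** 2))


for omega in (0.0, 2.0):
    print(omega, spectral_density(omega, 3.0, 1.0), lorentzian(omega, 3.0, 1.0))
```

Integrating the Lorentzian over all frequencies returns $K_0$, i.e. the variance, as Eq. (116) requires.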

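Let us use Eqs. (111) and (113) to calculate the spectral density for our model:

$$S_F(\omega) = \frac{1}{2\pi}\sum_{n,n'}W_n|F_{nn'}|^2\lim_{\varepsilon\to0}\int_{-\infty}^{+\infty}\cos\frac{\tilde E\tau}{\hbar}\,e^{-\varepsilon|\tau|}e^{i\omega\tau}d\tau$$

$$= \frac{1}{2\pi}\sum_{n,n'}W_n|F_{nn'}|^2\lim_{\varepsilon\to0}\mathrm{Re}\int_0^{+\infty}\left[\exp\left\{i\frac{\tilde E\tau}{\hbar}\right\} + \text{c.c.}\right]e^{-\varepsilon\tau}e^{-i\omega\tau}d\tau$$

$$= -\frac{1}{2\pi}\sum_{n,n'}W_n|F_{nn'}|^2\lim_{\varepsilon\to0}\mathrm{Re}\left[\frac{1}{i(\tilde E/\hbar - \omega) - \varepsilon} + \frac{1}{i(-\tilde E/\hbar - \omega) - \varepsilon}\right]. \qquad (7.117)$$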


Now it is a convenient time to recall that each of the two summations here is over the eigenenergy spectrum of the environment, whose spectrum is virtually continuous because of its large size, so that we may transform each sum into an integral just as this was done in Sec. 6.6:

$$\sum_n ... \to \int ...\,dn = \int ...\,\rho(E_n)dE_n, \qquad (7.118)$$


43 An alternative popular measure of spectral density is $S_F(\nu) \equiv \langle F_f^2\rangle/d\nu = 4\pi S_F(\omega)$, where $\nu = \omega/2\pi$ is the “cyclic” frequency (measured in Hz).
where ρ(E) is the density of the environment's states at a given energy. This transformation yields

$$S_F(\omega) = -\frac{1}{2\pi}\lim_{\varepsilon\to0}\mathrm{Re}\int dE_n W(E_n)\rho(E_n)\int dE_{n'}\rho(E_{n'})|F_{nn'}|^2\left[\frac{1}{i(\tilde E/\hbar - \omega) - \varepsilon} + \frac{1}{i(-\tilde E/\hbar - \omega) - \varepsilon}\right]. \qquad (7.119)$$


Since the square bracket depends only on a specific linear combination of the two energies, $\tilde E \equiv E_n - E_{n'}$, it is convenient to introduce also another, linearly-independent combination of the energies, for example, the average energy $\bar E \equiv (E_n + E_{n'})/2$, so that the state energies may be represented as

$$E_n = \bar E + \frac{\tilde E}{2}, \qquad E_{n'} = \bar E - \frac{\tilde E}{2}. \qquad (7.120)$$


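With this notation, Eq. (119) becomes

$$S_F(\omega) = -\frac{\hbar}{2\pi}\lim_{\varepsilon\to0}\mathrm{Re}\int d\bar E\left[\int W\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E - \frac{\tilde E}{2}\right)|F_{nn'}|^2\frac{d\tilde E}{i(\tilde E - \hbar\omega) - \hbar\varepsilon}\right.$$

$$\left. + \int W\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E - \frac{\tilde E}{2}\right)|F_{nn'}|^2\frac{d\tilde E}{i(-\tilde E - \hbar\omega) - \hbar\varepsilon}\right]. \qquad (7.121)$$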


Due to the smallness of the parameter ħε (which should be much less than all real energies, including $k_{\rm B}T$, ħω, $E_n$, and $E_{n'}$), each of the internal integrals is dominated by an infinitesimal vicinity of one point, $\tilde E_\pm = \pm\hbar\omega$, in which the density of states, the matrix elements, and the Gibbs probabilities do not change considerably, and may be taken out of the integrals, so that they may be worked out explicitly: 44

$$S_F(\omega) = -\frac{\hbar}{2\pi}\lim_{\varepsilon\to0}\mathrm{Re}\int d\bar E\,\rho_+\rho_-\left[W_+|F_+|^2\int_{-\infty}^{+\infty}\frac{d\tilde E}{i(\tilde E - \hbar\omega) - \hbar\varepsilon} + W_-|F_-|^2\int_{-\infty}^{+\infty}\frac{d\tilde E}{i(-\tilde E - \hbar\omega) - \hbar\varepsilon}\right]$$

$$= -\frac{\hbar}{2\pi}\lim_{\varepsilon\to0}\mathrm{Re}\int d\bar E\,\rho_+\rho_-\left[W_+|F_+|^2\int_{-\infty}^{+\infty}\frac{-i(\tilde E - \hbar\omega) - \hbar\varepsilon}{(\tilde E - \hbar\omega)^2 + (\hbar\varepsilon)^2}d\tilde E + W_-|F_-|^2\int_{-\infty}^{+\infty}\frac{i(\tilde E + \hbar\omega) - \hbar\varepsilon}{(\tilde E + \hbar\omega)^2 + (\hbar\varepsilon)^2}d\tilde E\right]$$

$$= \frac{\hbar}{2}\int\rho_+\rho_-\left[W_+|F_+|^2 + W_-|F_-|^2\right]d\bar E, \qquad (7.122)$$

where the indices ± mark function values at the special points $\tilde E = \pm\hbar\omega$, i.e. $E_{n'} = E_n \mp \hbar\omega$. The physics of these points becomes simple if we interpret state n, that is the argument of the equilibrium Gibbs distribution function $W_n$, as the initial state of the environment, and n' as its final state. Then the top-sign point corresponds to $E_{n'} = E_n - \hbar\omega$, i.e. to the emission of one energy quantum ħω of the “observation” frequency ω by the environment into the subsystem s of interest, while the bottom-sign point, $E_{n'} = E_n + \hbar\omega$, corresponds to the absorption of such a quantum by the environment. As Eq. (122) shows, both processes give similar positive contributions to the force fluctuations.


44 Using, e.g., MA Eq. (6.5a). (The imaginary parts of the integrals vanish, because integration in infinite limits may always be re-centered to the finite points ±ħω.) A mathematically enlightened reader may have noticed that the integrals might be taken without the introduction of small ε, using the Cauchy theorem - see MA Eq. (15.1).
The situation is different for the Fourier image of the response function G(τ), 45

$$\chi(\omega) \equiv \int_0^{+\infty}G(\tau)e^{i\omega\tau}d\tau, \qquad (7.123)$$

that is frequently called either the generalized susceptibility or the response function - in our case, of the environment. Its physical meaning is that the complex function χ(ω) = χ'(ω) + iχ''(ω) relates the Fourier amplitudes of the generalized coordinate and the generalized force: 46

$$\left\langle\hat F_\omega\right\rangle = \chi(\omega)\hat x_\omega. \qquad (7.124)$$


The physics of its imaginary part χ''(ω) is especially clear. Indeed, if both $F_\omega$ and $x_\omega$ represent a sinusoidal classical process, say

$$x(t) = x_0\cos\omega t \equiv \frac{x_0}{2}e^{-i\omega t} + \frac{x_0}{2}e^{+i\omega t}, \quad \text{i.e. } x_\omega = x_{-\omega} = \frac{x_0}{2}, \qquad (7.125)$$

then, in accordance with the correspondence principle, Eq. (124) should hold for the c-number complex amplitudes $F_\omega$ and $x_\omega$, enabling us to calculate the time dependence of the force,

$$F(t) = F_\omega e^{-i\omega t} + F_{-\omega}e^{+i\omega t} = \frac{x_0}{2}\left[\chi(\omega)e^{-i\omega t} + \chi(-\omega)e^{+i\omega t}\right]$$

$$= \frac{x_0}{2}\left[(\chi' + i\chi'')e^{-i\omega t} + (\chi' - i\chi'')e^{+i\omega t}\right] = x_0\left[\chi'(\omega)\cos\omega t + \chi''(\omega)\sin\omega t\right]. \qquad (7.126)$$

We see that χ''(ω) scales the part of the force that is π/2-shifted from the coordinate oscillations, i.e. is proportional to the velocity, and hence characterizes the time-average power flow from the system into the environment, i.e. the energy dissipation rate: 47

$$\bar P = -\overline{F(t)\dot x(t)} = -\overline{x_0\left[\chi'(\omega)\cos\omega t + \chi''(\omega)\sin\omega t\right](-\omega x_0\sin\omega t)} = \frac{x_0^2}{2}\omega\chi''(\omega). \qquad (7.127)$$


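Let us calculate this function from Eqs. (108) and (123), just as we have done for the spectral density of fluctuations:

$$\chi''(\omega) = \mathrm{Im}\int_0^{+\infty}G(\tau)e^{i\omega\tau}d\tau = -\frac{2}{\hbar}\sum_{n,n'}W_n|F_{nn'}|^2\lim_{\varepsilon\to0}\mathrm{Im}\int_0^{+\infty}\frac{1}{2i}\left[\exp\left\{i\frac{\tilde E\tau}{\hbar}\right\} - \text{c.c.}\right]e^{i\omega\tau}e^{-\varepsilon\tau}d\tau$$

$$= \sum_{n,n'}W_n|F_{nn'}|^2\lim_{\varepsilon\to0}\left[\frac{\hbar\varepsilon}{(\tilde E + \hbar\omega)^2 + (\hbar\varepsilon)^2} - \frac{\hbar\varepsilon}{(\tilde E - \hbar\omega)^2 + (\hbar\varepsilon)^2}\right]. \qquad (7.128)$$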
45 The integration in this definition may be extended to the whole time axis, -∞ < τ < +∞, if we complement the definition (107) of G(τ) for τ > 0 with its definition as G(τ) = 0 for τ < 0, in correspondence with the causality principle.

46 In order to prove this relation, it is sufficient to plug the expression $x(t) = x_\omega e^{-i\omega t}$, or any sum of such exponents, into Eq. (107) and then use the definition (123). This simple exercise is highly recommended to the reader.

47 The product $F\dot x \equiv Fv$, used here for the instant power flow, is evident if x is the usual Cartesian coordinate of a mechanical system. According to analytical mechanics (see, e.g., CM Chapters 2 and 10), it is valid for any generalized coordinate - generalized force pair which forms the interaction Hamiltonian (90).
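Making the transfer (118) from the double sum to the double integral, and then the integration variable transfer (120), we get

$$\chi''(\omega) = \lim_{\varepsilon\to0}\int d\bar E\int d\tilde E\,W\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E - \frac{\tilde E}{2}\right)|F_{nn'}|^2\left[\frac{\hbar\varepsilon}{(\tilde E + \hbar\omega)^2 + (\hbar\varepsilon)^2} - \frac{\hbar\varepsilon}{(\tilde E - \hbar\omega)^2 + (\hbar\varepsilon)^2}\right]. \qquad (7.129)$$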


Now using the same argument about the smallness of the parameter ε as above, we may take the densities of states, the matrix elements of the force, and the Gibbs probabilities out of the integrals, and work out the integrals, getting a result very similar to Eq. (122):

$$\chi''(\omega) = \pi\int\rho_+\rho_-\left[W_-|F_-|^2 - W_+|F_+|^2\right]d\bar E. \qquad (7.130)$$


In order to relate these results, it is sufficient to notice that according to Eq. (23), the Gibbs probabilities $W_\pm$ are related by coefficients depending only on the temperature T and the observation frequency ω:

$$W_\pm \equiv W\!\left(\bar E \pm \frac{\hbar\omega}{2}\right) = W(\bar E)\exp\left\{\mp\frac{\hbar\omega}{2k_{\rm B}T}\right\}, \qquad (7.131)$$


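so that both the spectral density and the dissipative part of the susceptibility may be expressed via the same integral over the environment energies (here $|F_+| = |F_-|$, because the operator $\hat F$ is Hermitian):

$$S_F(\omega) = \hbar\cosh\left(\frac{\hbar\omega}{2k_{\rm B}T}\right)\int\rho_+\rho_-W(\bar E)|F_+|^2d\bar E, \qquad (7.132)$$

$$\chi''(\omega) = 2\pi\sinh\left(\frac{\hbar\omega}{2k_{\rm B}T}\right)\int\rho_+\rho_-W(\bar E)|F_+|^2d\bar E, \qquad (7.133)$$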


and hence are universally related as

$$S_F(\omega) = \frac{\hbar}{2\pi}\chi''(\omega)\coth\frac{\hbar\omega}{2k_{\rm B}T}. \qquad (7.134)$$


This is the Callen-Welton fluctuation-dissipation theorem (FDT). It reveals the fundamental, intimate relation between dissipation and fluctuations induced by the environment (“no dissipation without fluctuations”) - hence the name. 48 In the classical limit, $\hbar\omega \ll k_{\rm B}T$, the FDT is reduced to

48 A curious feature of the FDT is that Eq. (134) includes exactly the same function of temperature as the average energy (26) of a quantum oscillator of frequency ω, though, as the reader could witness, the notion of the

$$S_F(\omega) = \frac{\hbar}{2\pi}\chi''(\omega)\frac{2k_{\rm B}T}{\hbar\omega} \equiv \frac{k_{\rm B}T}{\pi}\frac{\chi''(\omega)}{\omega}. \qquad (7.135)$$
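The crossover between the quantum and classical forms of the FDT can be made concrete with a few lines of Python. The sketch below assumes the standard form of the theorem, $S_F/\chi'' = (\hbar/2\pi)\coth(\hbar\omega/2k_{\rm B}T)$, and its classical limit $k_{\rm B}T/\pi\omega$; the function names and the sample frequencies are illustrative choices, while the constants are the SI values of ħ and $k_{\rm B}$.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def fdt_ratio(omega, T):
    """S_F / chi'' from the FDT: (hbar / 2 pi) * coth(hbar*omega / (2 k_B T))."""
    x = HBAR * omega / (2.0 * K_B * T)
    return HBAR / (2.0 * math.pi) / math.tanh(x)


def classical_ratio(omega, T):
    """Classical limit of the same ratio: k_B T / (pi * omega)."""
    return K_B * T / (math.pi * omega)


T = 300.0  # room temperature, K
for omega in (1e9, 1e13, 1e15):  # rad/s
    print(omega, fdt_ratio(omega, T) / classical_ratio(omega, T))
```

At room temperature the two expressions coincide for radio and microwave frequencies, while at optical frequencies the ratio $S_F/\chi''$ saturates at $\hbar/2\pi$, i.e. only the temperature-independent (zero-point) part survives.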


In most systems of interest the last fraction tends to a finite (positive) constant in a substantial range of relatively low frequencies. Indeed, expanding Eq. (123) in the Taylor series in small ω, we get

$$\chi(\omega) = \int_0^{+\infty}G(\tau)d\tau + i\omega\int_0^{+\infty}G(\tau)\tau\,d\tau + ... \qquad (7.136)$$


Since the temporal Green’s function is real by definition, the Taylor expansion of x ”( (0 ) — hn^ty) starts 
with the linear term icot), where i] is a certain real coefficient and unless ij = 0, is dominated by this 
term at small co. (The physical sense of constant rj becomes clear if we consider an environment that 
provides viscous friction with the simple law 



F) = -rjx, ?]>0. 


(7.137) 


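For the Fourier images of coordinate and force this gives the relation $F_\omega = i\omega\eta x_\omega$, so that according to Eq.
(124),

$$\frac{\chi''(\omega)}{\omega} \equiv \frac{\mathrm{Im}\,\chi(\omega)}{\omega} = \eta. \tag{7.138}$$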


Hence, even in the general case, coefficient $\eta$ may be considered as an effective low-speed viscosity
provided by the environment.)

In this case Eq. (134) turns into the Nyquist formula: 49

$$S_F(\omega) = \frac{k_BT}{\pi}\eta, \quad \text{i.e.} \quad \langle \tilde{F}^2 \rangle_{d\nu} = 4k_BT\eta\,d\nu. \tag{7.139}$$

According to Eq. (112), if such a constant spectral density 50 persisted at all frequencies, it would
correspond to a delta-correlated process $F(t)$, with

$$K_F(\tau) = 2\pi S_F(0)\delta(\tau) = 2k_BT\eta\,\delta(\tau), \tag{7.140}$$

similar to the one already discussed above - see Eq. (82).


oscillator was by no means used in its derivation. As we will see in the next section, this fact leads to rather
interesting consequences and even conceptual opportunities.

49 Actually, the 1928 work by H. Nyquist was about electronic noise in resistors, just discovered experimentally
by his Bell Labs colleague J. Johnson. For an Ohmic resistor, as a dissipative “environment” of the electric circuit
it is connected with, Eq. (137) is just Ohm’s law, and may be recast as either $\langle V \rangle = -R(dQ/dt) = RI$, or $\langle I \rangle = -G(d\Phi/dt) = GV$.
Thus for voltage $V$ in an open circuit, $\eta$ corresponds to resistance $R$, while for current $I$ in the
short circuit, to conductance $G = 1/R$. In this case, the fluctuations described by Eq. (139) are referred to as the
Johnson-Nyquist noise. (Because of this important application, any model leading to Eqs. (136)-(137) is
frequently referred to as Ohmic dissipation, even if the physical nature of variables $x$ and $F$ is quite different.)
Another note: the Nyquist formula (139) should not be confused with the Nyquist-Shannon theorem describing
the minimum sampling rate of an analog signal.

50 A random process whose properties may be reasonably approximated by a constant spectral density is frequently
called white noise, because then it is a random mixture of all possible sinusoidal components with equal
weights, reminiscent of the composition of natural white light.
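As a quick numerical illustration of the Nyquist formula (139) in its Johnson-Nyquist form, the sketch below evaluates the rms noise voltage and current of a resistor. The function names and the example values (1 kΩ, 300 K, 1 Hz) are assumptions chosen only for illustration; the formulas used are $\langle V^2 \rangle = 4k_BTR\,d\nu$ and $\langle I^2 \rangle = 4k_BTG\,d\nu$ with $G = 1/R$.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def johnson_noise_vrms(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS open-circuit noise voltage from <V^2> = 4 k_B T R d(nu)."""
    return math.sqrt(4.0 * K_B * temperature_k * resistance_ohm * bandwidth_hz)


def johnson_noise_irms(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS short-circuit noise current from <I^2> = 4 k_B T G d(nu), G = 1/R."""
    return math.sqrt(4.0 * K_B * temperature_k * bandwidth_hz / resistance_ohm)


if __name__ == "__main__":
    # Hypothetical example: 1 kOhm resistor at room temperature, 1 Hz bandwidth.
    v = johnson_noise_vrms(1.0e3, 300.0, 1.0)
    i = johnson_noise_irms(1.0e3, 300.0, 1.0)
    print(f"V_rms = {v * 1e9:.2f} nV, I_rms = {i * 1e12:.2f} pA")
```

For these example values the voltage comes out at about 4 nV, and the product of the two rms values equals $4k_BT\,d\nu$ regardless of $R$, in agreement with the roles of $R$ and $G$ as the "viscosity" for the two circuit configurations.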




Since in the classical limit the right-hand part of Eq. (109) is negligible, and the correlation
function may be considered an even function of time, the symmetrized function under the integral in Eq.
(113) may be rewritten just as $\langle F(\tau)F(0) \rangle$. In the limit of low observation frequencies (in the sense that $\omega$
is much smaller than not only the quantum frontier $k_BT/\hbar$, but also the frequency scale of function
$\chi''(\omega)/\omega$), Eq. (138) may be used to recast Eq. (135) in the form 51

$$\eta = \frac{1}{k_BT}\int_0^\infty \langle F(\tau)F(0) \rangle\,d\tau. \tag{7.141}$$

To conclude this section, let me return for a minute to the questions formulated in our earlier
discussion of dephasing in the two-level model. In that problem, the dephasing time scale is $T_2 = 1/2D_\varphi$.
Hence the classical approach to the environment, used in Sec. 3, is adequate if $\hbar D_\varphi \ll k_BT$. Next, we
may identify operators $\hat{f}$ and $\hat{\sigma}_z$, participating in Eq. (70), with, respectively, operators $\hat{F}$ and $\hat{x}$ of the
general Eq. (90). Then the comparison of Eqs. (82), (88) and (140) yields

$$\frac{1}{T_2} \equiv 2D_\varphi = \frac{4k_BT}{\hbar^2}\eta, \tag{7.142}$$

so that, for the model described by Eq. (137) with temperature-independent viscosity $\eta$, the dephasing
rate is proportional to temperature.


7.5. The Heisenberg-Langevin approach 

The fluctuation-dissipation theorem opens a very simple and efficient way for analysis of the
system of interest (s in Fig. 1). It is to write its Heisenberg equations (4.199) of motion for relevant
operators, which would now include the environmental force operator, and explore these equations
using the Fourier transform and the Wiener-Khinchin theorem (112)-(113). Such an approach to classical
equations of motion is commonly associated with the name of Langevin, 52 so that its extension to
dynamics of Heisenberg-picture operators is frequently referred to as the Heisenberg-Langevin (or
“quantum Langevin”) approach to open system analysis. 53

Perhaps the best way to describe this method is to demonstrate how it works for the very
important case of a 1D harmonic oscillator, so that the generalized coordinate $x$ of Sec. 4 is just the
oscillator’s coordinate. For the sake of simplicity, let us assume that the environment provides the
simple Ohmic dissipation described by Eq. (137) - which is a good approximation in many cases. As we
already know from Chapter 5, the Heisenberg equations of motion for operators of coordinate and
momentum of the oscillator, in the presence of external force, are


51 In some fields (especially in physical kinetics and chemical physics), this particular limit of the 
Nyquist formula, is called the Green-Kubo (or just “Kubo”) formula. As was discussed above, these 
names may be more reasonably associated with Eq. (109). 

52 After P. Langevin, whose 1908 work was the first systematic development of Einstein’s ideas (1905) of the 
Brownian motion theory in the random force language, as an alternative to M. Smoluchowski’ s approach using 
the probability density language - see Sec. 6 below. 

53 Perhaps the largest credit for this extension belongs to M. Lax whose work, in the early 1960s, was motivated 
mostly by quantum electronics applications - see, e.g., his monograph M. Lax, Fluctuation and Coherent 
Phenomena in Classical and Quantum Physics , Gordon and Breach, 1968, and references therein. 


Dephasing 
time via 
viscosity 




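$$\dot{\hat{x}} = \frac{\hat{p}}{m}, \qquad \dot{\hat{p}} = -m\omega_0^2\hat{x} + \hat{F}, \tag{7.143}$$

so that using Eqs. (92) and (137), we get

$$\dot{\hat{x}} = \frac{\hat{p}}{m}, \qquad \dot{\hat{p}} = -m\omega_0^2\hat{x} - \eta\dot{\hat{x}} + \hat{\tilde{F}}(t). \tag{7.144}$$

Combining Eqs. (144), we may write their system as a single differential equation

$$m\ddot{\hat{x}} + \eta\dot{\hat{x}} + m\omega_0^2\hat{x} = \hat{\tilde{F}}(t), \tag{7.145}$$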


that is absolutely similar to the classical equation of motion. 54 (In view of Eqs. (5.42) and (5.48),
whose corollary the Ehrenfest theorem (5.49) is, this should be by no means surprising.) For the Fourier
images of the operators, defined similarly to Eq. (115), Eq. (145) gives the following relation,

$$\hat{x}_\omega = \frac{\hat{F}_\omega}{m(\omega_0^2 - \omega^2) - i\eta\omega}, \tag{7.146}$$


that should be also well known to the reader from the classical theory of forced oscillations. However,
since the Fourier components are still Heisenberg-picture operators, and their “values” for different $\omega$
do not commute, we have to tread carefully. The best way to proceed is to write a copy of Eq. (146) for
frequency $(-\omega')$, and then combine these equations to form a symmetrical combination similar to that used
in Eq. (114). The result is

$$\frac{1}{2}\left\langle \hat{x}_\omega\hat{x}_{-\omega'} + \hat{x}_{-\omega'}\hat{x}_\omega \right\rangle = \frac{1}{\left| m(\omega_0^2 - \omega^2) - i\eta\omega \right|^2}\,\frac{1}{2}\left\langle \hat{F}_\omega\hat{F}_{-\omega'} + \hat{F}_{-\omega'}\hat{F}_\omega \right\rangle. \tag{7.147}$$


Since the spectral density definition similar to Eq. (114) is valid for any observable, in particular for $x$,
Eq. (147) allows us to relate the symmetrized spectral densities of coordinate and force:

$$S_x(\omega) = \frac{S_F(\omega)}{\left| m(\omega_0^2 - \omega^2) - i\eta\omega \right|^2} \equiv \frac{S_F(\omega)}{m^2(\omega_0^2 - \omega^2)^2 + (\eta\omega)^2}. \tag{7.148}$$


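Now using an analog of Eq. (116) for $x$, we can calculate the coordinate’s variance:

$$\langle \hat{x}^2 \rangle = K_x(0) = \int_{-\infty}^{+\infty} S_x(\omega)\,d\omega = 2\int_0^{\infty} \frac{S_F(\omega)\,d\omega}{m^2(\omega_0^2 - \omega^2)^2 + (\eta\omega)^2}, \tag{7.149}$$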


where now, in contrast to the notation used in Sec. 4, sign $\langle \ldots \rangle$ means the averaging over the usual
statistical ensemble of many systems of interest - in our current case, of many harmonic oscillators.

If the coupling to the environment is so weak that viscosity $\eta$ is small (in the sense that the
oscillator’s dimensionless Q-factor is large, $Q \equiv m\omega_0/\eta \gg 1$), this integral is dominated by the
resonance peak in a narrow vicinity, $|\omega - \omega_0| \equiv |\xi| \ll \omega_0$, of its resonance frequency, and we can take
the relatively smooth function $S_F(\omega)$ out of the integral, thus reducing it to a table integral: 55


54 See, e.g., CM Sec. 4.1. 

55 See, e.g., MA Eq. (6.5a). 
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With the account of the FDT (134) and Eq. (138), this gives 
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(7.151) 


But this is exactly Eq. (48) that was obtained from the Gibbs distribution, without any explicit account
of the environment - though keeping it in mind by using the notion of the thermally-equilibrium
ensemble. 56 (Notice that the viscosity coefficient $\eta$, which characterizes the oscillator-to-environment
interaction strength, has cancelled!) Does this mean that we have toiled in vain?
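The cancellation of $\eta$ can be checked numerically. The sketch below integrates Eq. (149) directly, assuming the FDT in the form $S_F(\omega) = (\hbar/2\pi)\chi''(\omega)\coth(\hbar\omega/2k_BT)$ with the Ohmic $\chi''(\omega) = \eta\omega$ of Eq. (138) (this form is consistent with the Nyquist limit (139)), and compares the result with the Gibbs-distribution value $(\hbar/2m\omega_0)\coth(\hbar\omega_0/2k_BT)$. The units $\hbar = m = \omega_0 = 1$, the cutoff `omega_max`, the integration step, and all function names are illustrative assumptions.

```python
import math


def s_force(omega, eta, kT, hbar=1.0):
    """Symmetrized force spectral density from the FDT with chi'' = eta * omega."""
    if omega < 1e-12:
        return eta * kT / math.pi  # omega -> 0 limit (the Nyquist formula)
    return hbar * eta * omega / (2.0 * math.pi) / math.tanh(hbar * omega / (2.0 * kT))


def x_variance(eta, kT, m=1.0, omega0=1.0, hbar=1.0, omega_max=50.0, step=2e-4):
    """Trapezoidal estimate of 2 * int_0^omega_max S_F / [m^2 (w0^2 - w^2)^2 + (eta w)^2] dw."""
    n = int(round(omega_max / step))
    total = 0.0
    for i in range(n + 1):
        w = i * step
        denom = m**2 * (omega0**2 - w**2) ** 2 + (eta * w) ** 2
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * 2.0 * s_force(w, eta, kT, hbar) / denom
    return total * step


def gibbs_variance(kT, m=1.0, omega0=1.0, hbar=1.0):
    """Coordinate variance of an isolated oscillator in thermal equilibrium."""
    return hbar / (2.0 * m * omega0) / math.tanh(hbar * omega0 / (2.0 * kT))


if __name__ == "__main__":
    for eta in (0.02, 0.05):  # Q = 50 and Q = 20 in these units
        print(eta, x_variance(eta, kT=1.0), gibbs_variance(kT=1.0))
```

For large Q-factors the two numbers agree to within a small correction of the order of $1/Q$, and changing $\eta$ barely shifts the integral, which is the numerical counterpart of the cancellation noted above.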


By no means. First of all, the FDT result has an important conceptual value. For example, let us
consider the low-temperature limit $k_BT \ll \hbar\omega_0$, when Eq. (151) is reduced to

$$\langle \hat{x}^2 \rangle = \frac{\hbar}{2m\omega_0}. \tag{7.152}$$


Let us ask a naive question: What exactly is the origin of this coordinate uncertainty? From the point of
view of the usual quantum mechanics of closed (Hamiltonian) systems, there is no doubt: this
nonvanishing variance of coordinate is the result of the finite spatial extension of the ground-state
wavefunction, reflecting Heisenberg’s uncertainty relation (which in turn results from the fact that the
operators of coordinate and momentum do not commute) - see Eq. (2.271). However, from the point of
view of the Heisenberg-Langevin equation (145), variance (152) is an inalienable part of the oscillator’s
response to the fluctuation force $\tilde{F}(t)$ exerted by the environment at frequencies $\omega \sim \omega_0$. Though it is
impossible to refute the former, absolutely legitimate point of view, in many applications it is much
easier to subscribe to the latter standpoint, and treat the coordinate uncertainty as the result of the so-called
quantum noise of the environment. This notion has received numerous confirmations in
experiments that did not include any oscillators with the eigenfrequencies $\omega_0$ close to the noise
measurement frequency $\omega$. 57

The advantage of the Heisenberg-Langevin approach is that for any $\eta > 0$ it is possible to
calculate the (experimentally measurable!) distribution $S_x(\omega)$, i.e. decompose the fluctuations into
spectral components. This procedure is not restricted to the limit of small $\eta$ (large Q-factors); for any
damping we may just plug the FDT (134) into Eq. (149) and integrate. As an example, let us have a look
at the so-called quantum diffusion. A free 1D particle may be considered as the particular case of a 1D
harmonic oscillator with $\omega_0 = 0$, so that combining Eqs. (134) and (149), we get


56 By the way, the simplest way to calculate $S_F(\omega)$, i.e. to derive the FDT, is to require that Eqs. (48) and (150)
give the same result for an oscillator with any eigenfrequency $\omega$. This is exactly the approach used by H. Nyquist
(for the classical case) - see also SM Sec. 5.5.

57 See, for example, R. Koch et al., Phys. Rev. B 26, 74 (1982).




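$$\langle \hat{x}^2 \rangle = 2\int_0^\infty \frac{S_F(\omega)\,d\omega}{(m\omega^2)^2 + (\eta\omega)^2} = 2\eta\int_0^\infty \frac{1}{(m\omega^2)^2 + (\eta\omega)^2}\,\frac{\hbar\omega}{2\pi}\coth\frac{\hbar\omega}{2k_BT}\,d\omega. \tag{7.153}$$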


This integral has two divergences. The first one, of the type $\int d\omega/\omega^2$ at the lower limit, is just a
classical effect: according to Eq. (85), the particle’s displacement variance grows with time, so it cannot
have the finite time-independent value that Eq. (153) tries to calculate. However, we still can use that
result to single out the quantum noise effect on diffusion - say, by comparing it with a similar but purely
classical case. These effects are prominent at high frequencies, especially if the quantum noise
overcomes the thermal noise before the dynamic cut-off, i.e. if
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In this case there is a broad range of frequencies where the quantum noise gives a substantial 
contribution to the integral: 
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Formally, this contribution diverges at either m — > 0 or T — > 0, but this logarithmic (i.e. extremely weak) 
divergence is readily quenched by almost any change of the environment model at very high
frequencies, where the “Ohmic” approximation given by Eq. (136) becomes unrealistic.

The Heisenberg-Langevin approach is extremely simple and powerful, 58 but it has its limitations.
The main one is that if the equations of motion for the Heisenberg operators are not linear, there is no
linear relation, such as Eq. (146), between the Fourier images of the generalized force and generalized
coordinate, and as a result there is no simple relation, such as Eq. (148), between their spectral
densities. In other words, if the Heisenberg equations of motion are nonlinear, there is no regular simple
way to use them to calculate statistical properties of the observables. For example, let us return to the
dephasing problem described by Eqs. (68)-(70), and assume that the generalized force is characterized
by relations similar to (93) and (134). Now writing the Heisenberg equations of motion for the two
remaining spin operators, and using the commutation relations between them, we get




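$$\dot{\hat{\sigma}}_x = -2\,\frac{a - \hat{f}(t)}{\hbar}\,\hat{\sigma}_y, \qquad \dot{\hat{\sigma}}_y = 2\,\frac{a - \hat{f}(t)}{\hbar}\,\hat{\sigma}_x. \tag{7.156}$$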


These equations do not provide a linear relation between the Pauli operators and the fluctuation force, so 
even if we know spectral properties of the latter from the FDT, this does not help too much - unless we 
return to the approximate, classical approach described in Sec. 3 above. 59 


58 Its natural generalizations enable analyses of fluctuations in arbitrary linear systems, i.e. the systems described 
by linear differential (or integro-differential) equations of motion, including those with many degrees of freedom, 
and distributed systems ( continua ). 

59 For some calculations, this problem may be avoided by linearization : if we are only interested in small 
fluctuations, the Heisenberg equations of motion may be linearized about their expectation values (see, e.g., CM 
Sec. 4.2), and the linear equations for variations solved either as has been shown above, or (if the expectation 
values evolve in time) by their Fourier expansions. 
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7.6. Density matrix approach 

The main alternative approach, which is essentially a generalization of that used in Sec. 2, is to
extract the final results from the dynamics of the density matrix of our subsystem s of interest (which,
from this point on, will be called $\hat{w}_s$). I will discuss this approach in detail, 60 cutting just a few technical
corners, in each case referring the reader to special literature.

We already know that the density matrix allows the calculation of the expectation value of any
observable of system s - see Eq. (5). However, our initial recipe (6) for the density matrix calculation,
which requires the knowledge of the exact state (2) of the whole Universe, is not too practicable, while
the von Neumann equation (66) for the density matrix evolution is limited to cases in which
probabilities $W_j$ of the system states are fixed - thus excluding such important effects as the energy
relaxation. However, such effects may be analyzed using a different assumption - that the system of
interest interacts only with some local environment (say, with the lab room) that is in the thermally-equilibrium
state described by a diagonal density matrix - see Eqs. (15) and (23).

This calculation is facilitated by the following observation. Let us number the basis states of the
full local system (the system of our interest plus its local environment) by index $l$, and apply Eq. (5) to
write

$$\langle A \rangle = \mathrm{Tr}\left(\hat{A}\hat{w}\right) = \sum_{l,l'} A_{ll'}w_{l'l}, \tag{7.157}$$

where $\hat{w}$ is the statistical operator of this full composite system. At weak interaction between the
system s and local environment e, their variables reside in different Hilbert spaces, so that we can write

$$|l\rangle = |s_j\rangle \otimes |e_k\rangle, \tag{7.158}$$


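and if observable $A$ depends only on the coordinates of system s, Eq. (157) yields

$$\langle A \rangle = \sum_{j,j'}\sum_k A_{jj'}\left\langle s_{j'} \otimes e_k \right|\hat{w}\left| s_j \otimes e_k \right\rangle = \sum_{j,j'} A_{jj'}\left\langle s_{j'} \right|\hat{w}_s\left| s_j \right\rangle = \mathrm{Tr}_j\left(\hat{A}\hat{w}_s\right), \tag{7.159}$$

where

$$\hat{w}_s \equiv \sum_k \left\langle e_k \right|\hat{w}\left| e_k \right\rangle = \mathrm{Tr}_k\,\hat{w}. \tag{7.160}$$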


Since Eq. (159) is similar to Eq. (5), $\hat{w}_s$ may serve as the statistical operator defined in the Hilbert space
of the system of our interest. The huge advantage of Eqs. (159)-(160) is that they are valid for an
arbitrary state of the local environment, including the case when it is in the thermodynamic equilibrium.
By the way, the similarity of Eqs. (5) and (159) may serve as the strong argument, promised in Sec. 1,
for the validity of the former relation even if the Universe as a whole is not in a pure state. (The
argument is, however, imperfect, because the latter relation has been derived from the former one.)
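The partial trace of Eq. (160) is easy to try out on a small example. The sketch below is an illustration under stated assumptions: both the system and the environment are two-level, the composite index is ordered as $l = j \cdot \dim_e + k$, and the helper names are hypothetical. It builds the reduced operator $\hat{w}_s$ from a composite pure state and checks that the expectation value computed in the full space coincides with $\mathrm{Tr}_j(\hat{A}\hat{w}_s)$, as in Eq. (159).

```python
def outer(psi):
    """Density matrix |psi><psi| of a pure state given as a list of amplitudes."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def partial_trace_env(w, dim_s, dim_e):
    """Reduced operator w_s[j][j'] = sum_k w[(j,k)][(j',k)], composite index l = j*dim_e + k."""
    return [[sum(w[j * dim_e + k][jp * dim_e + k] for k in range(dim_e))
             for jp in range(dim_s)]
            for j in range(dim_s)]


def expectation_full(a_s, w, dim_s, dim_e):
    """Tr[(A x 1_e) w] in the composite space, for A acting on the system only."""
    total = 0
    for j in range(dim_s):
        for jp in range(dim_s):
            for k in range(dim_e):
                total += a_s[j][jp] * w[jp * dim_e + k][j * dim_e + k]
    return total


def expectation_reduced(a_s, w_s):
    """Tr_j(A w_s) in the Hilbert space of the system alone."""
    n = len(a_s)
    return sum(a_s[j][jp] * w_s[jp][j] for j in range(n) for jp in range(n))


if __name__ == "__main__":
    # Entangled state 0.6|s0,e0> + 0.8|s1,e1> (an arbitrary illustrative choice).
    psi = [0.6 + 0j, 0j, 0j, 0.8 + 0j]
    w = outer(psi)
    w_s = partial_trace_env(w, 2, 2)
    sigma_z = [[1, 0], [0, -1]]
    print(w_s)
    print(expectation_full(sigma_z, w, 2, 2), expectation_reduced(sigma_z, w_s))
```

Although the composite state here is pure, the reduced operator comes out diagonal with unequal weights, i.e. it describes a mixed state of the system, which is the situation the density matrix approach is designed for.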


60 As in Sec. 4, the reader not interested in the derivation of the basic equation (181) for the density matrix 
evolution may immediately jump to the discussion of this equation and its applications. 
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Now, since at a sufficiently large size of the local environment e, the composite system (5 + e) 
may be considered Hamiltonian, with fixed probabilities of its states, for the description of time
evolution of its statistical operator $\hat{w}$ (again, in contrast to that, $\hat{w}_s$, of the system of our interest) we
may use the von Neumann equation (66). Partitioning its right-hand part in accordance with Eq. (68), we
get:

$$i\hbar\dot{\hat{w}} = \left[\hat{H}_s, \hat{w}\right] + \left[\hat{H}_e, \hat{w}\right] + \left[\hat{H}_{int}, \hat{w}\right]. \tag{7.161}$$

The next step is to use the perturbation theory to solve this equation in the lowest order in $\hat{H}_{int}$ that
yields nonvanishing results due to the interaction. For that, Eq. (161) is not very convenient, because its
right-hand part contains two other terms, which are much larger than the interaction Hamiltonian. To
mitigate this technical difficulty, the interaction picture (which was discussed at the end of Sec. 4.6) is
very handy - though not absolutely necessary.

As a reminder, in that picture (whose entities will be marked with index I, with the unmarked
operators assumed to be in the Schrödinger picture), both the operators and the state vectors (and hence
the density matrix) depend on time. However, the time evolution of the operator of any observable $A$ is
described by Eq. (67) with the unperturbed part of the Hamiltonian only - see Eq. (4.214). In our
current case (68), this means

$$i\hbar\dot{\hat{A}}_I = \left[\hat{A}_I, \hat{H}_0\right], \tag{7.162}$$

where the unperturbed Hamiltonian consists of two independent parts:

$$\hat{H}_0 \equiv \hat{H}_s + \hat{H}_e. \tag{7.163}$$

On the other hand, the state vector evolution is governed by the interaction evolution operator $\hat{u}_I$ that
obeys Eqs. (4.215). Since this equation, using the interaction-picture Hamiltonian (4.216),

$$\hat{H}_I \equiv \hat{u}_0^\dagger\hat{H}_{int}\hat{u}_0, \tag{7.164}$$

is absolutely similar to the ordinary Schrödinger equation using the full Hamiltonian, we may repeat all
arguments given in the beginning of Sec. 3 to conclude that the dynamics of the density matrix in the
interaction picture of a Hamiltonian system is governed by the following analog of the von Neumann
equation (66):

$$i\hbar\dot{\hat{w}}_I = \left[\hat{H}_I, \hat{w}_I\right]. \tag{7.165}$$

Since this equation is similar in structure (with the opposite sign) to the Heisenberg equation (66), we
may use solution Eq. (4.190) of the latter equation to write its analog: 61

$$\hat{w}_I(t) = \hat{u}_I(t,0)\,\hat{w}(0)\,\hat{u}_I^\dagger(t,0). \tag{7.166}$$

It is also straightforward to verify that in this picture, the expectation value of any observable $A$ may be
found from the expression similar to the basic Eq. (5):

$$\langle A \rangle = \mathrm{Tr}\left(\hat{A}_I\hat{w}_I\right), \tag{7.167}$$

so that the interaction and Schrödinger pictures give the same final results.

61 Notice the opposite order of the unitary operators, which results from the already mentioned sign difference.
Note also that we could write a similar expression in the Schrödinger picture: $\hat{w}(t) = \hat{u}\,\hat{w}(0)\,\hat{u}^\dagger$, where $\hat{u}$ is the full
time-evolution operator.

In the most frequent case of bilinear interaction (90), 62 Eq. (162) is readily simplified, in
different ways, for both operators participating in the product. In particular, for $\hat{A} = \hat{x}$, it yields

$$i\hbar\dot{\hat{x}}_I = \left[\hat{x}_I, \hat{H}_0\right] \equiv \left[\hat{x}_I, \hat{H}_s\right] + \left[\hat{x}_I, \hat{H}_e\right]. \tag{7.168}$$

Since the operator of coordinate is defined in the Hilbert space of system s, it commutes with the
Hamiltonian of the environment, so that we finally get

$$i\hbar\dot{\hat{x}}_I = \left[\hat{x}_I, \hat{H}_s\right]. \tag{7.169}$$

On the other hand, taking $\hat{A} = \hat{F}$, we should take into account that the last operator is defined in the
Hilbert space of the environment, and commutes with the Hamiltonian of the unperturbed system s. As a
result, we get

$$i\hbar\dot{\hat{F}}_I = \left[\hat{F}_I, \hat{H}_e\right]. \tag{7.170}$$

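This means that with our time-independent unperturbed Hamiltonians $\hat{H}_s$ and $\hat{H}_e$, the time evolution of
the interaction-picture operators is rather simple. In particular, the analogy between Eq. (170) and Eq.
(93) allows us to immediately write the following analog of Eq. (94):

$$\hat{F}_I(t) = \exp\left\{+\frac{i}{\hbar}\hat{H}_e t\right\}\hat{F}(0)\exp\left\{-\frac{i}{\hbar}\hat{H}_e t\right\}, \tag{7.171}$$

so that in the stationary (eigenstate) basis of the environment,

$$\left(\hat{F}_I\right)_{nn'}(t) = \exp\left\{+\frac{i}{\hbar}E_n t\right\}F_{nn'}(0)\exp\left\{-\frac{i}{\hbar}E_{n'} t\right\} = F_{nn'}(0)\exp\left\{\frac{i}{\hbar}(E_n - E_{n'})t\right\}, \tag{7.172}$$

and similarly (but in the basis of the eigenstates of system s) for operator $\hat{x}$. As a result, Eq. (164) may
be also factored:

$$\hat{H}_I(t) = \hat{u}_0^\dagger(t,0)\hat{H}_{int}\hat{u}_0(t,0) = \exp\left\{\frac{i}{\hbar}\left(\hat{H}_s + \hat{H}_e\right)t\right\}\left(-\hat{x}\hat{F}\right)\exp\left\{-\frac{i}{\hbar}\left(\hat{H}_s + \hat{H}_e\right)t\right\}$$
$$= -\left[\exp\left\{\frac{i}{\hbar}\hat{H}_s t\right\}\hat{x}\exp\left\{-\frac{i}{\hbar}\hat{H}_s t\right\}\right]\left[\exp\left\{\frac{i}{\hbar}\hat{H}_e t\right\}\hat{F}(0)\exp\left\{-\frac{i}{\hbar}\hat{H}_e t\right\}\right] \equiv -\hat{x}_I(t)\hat{F}_I(t). \tag{7.173}$$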

Now, as in Sec. 4, we may rewrite Eq. (165) in the integral form:

$$\hat{w}_I(t) = \frac{1}{i\hbar}\int_{-\infty}^{t}\left[\hat{H}_I(t'), \hat{w}_I(t')\right]dt'. \tag{7.174}$$

62 A similar analysis of a more general case, when the interaction with environment may be represented as a sum
of products of the type (90), may be found in a monograph by K. Blum, Density Matrix Theory and Applications,
3rd ed., Springer, 2012.

Plugging this result, for time $t'$, into the right-hand part of Eq. (165) again, we get

$$\dot{\hat{w}}_I(t) = -\frac{1}{\hbar^2}\int_{-\infty}^{t}\left[\hat{H}_I(t), \left[\hat{H}_I(t'), \hat{w}_I(t')\right]\right]dt' \equiv -\frac{1}{\hbar^2}\int_{-\infty}^{t}\left[\hat{x}(t)\hat{F}(t), \left[\hat{x}(t')\hat{F}(t'), \hat{w}_I(t')\right]\right]dt', \tag{7.175}$$

where, for the notation brevity, from this point on I will strip operators $\hat{x}$ and $\hat{F}$ of their index I. (Their
time dependence indicates the interaction picture clearly enough.)

So far, this equation is exact (and cannot be solved analytically), but this is the right time to
notice that even if we take the density matrix in its right-hand part equal to its unperturbed value
(corresponding to no interaction between system s and its thermally-equilibrium environment e),

$$\hat{w}_I(t') \to \hat{w}_s(t')\hat{w}_e, \quad \text{with } \left\langle e_n \right|\hat{w}_e\left| e_{n'} \right\rangle = W_n\delta_{nn'}, \tag{7.176}$$

where $e_n$ are the stationary states of the environment and $W_n$ are the Gibbs probabilities (23), Eq. (175)
would still provide some nonvanishing time evolution of the density operator. This is exactly the first
nonvanishing perturbation we have been looking for. Now using Eq. (160), we find the equation of
evolution of the density operator of our system of interest:

$$\dot{\hat{w}}_s(t) = -\frac{1}{\hbar^2}\int_{-\infty}^{t}\mathrm{Tr}_n\left[\hat{x}(t)\hat{F}(t), \left[\hat{x}(t')\hat{F}(t'), \hat{w}_s(t')\hat{w}_e\right]\right]dt', \tag{7.177}$$


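where the trace is over the stationary states of the environment. In order to spell out the right-hand part
of Eq. (177), note again that the coordinate and force operators commute with each other (but not with
themselves at different time moments!) and hence may be swapped, so that we may write

$$\mathrm{Tr}_n\left[\hat{x}(t)\hat{F}(t), \left[\hat{x}(t')\hat{F}(t'), \hat{w}_s(t')\hat{w}_e\right]\right]$$
$$= \hat{x}(t)\hat{x}(t')\hat{w}_s(t')\,\mathrm{Tr}_n\left[\hat{F}(t)\hat{F}(t')\hat{w}_e\right] - \hat{x}(t)\hat{w}_s(t')\hat{x}(t')\,\mathrm{Tr}_n\left[\hat{F}(t)\hat{w}_e\hat{F}(t')\right]$$
$$\quad - \hat{x}(t')\hat{w}_s(t')\hat{x}(t)\,\mathrm{Tr}_n\left[\hat{F}(t')\hat{w}_e\hat{F}(t)\right] + \hat{w}_s(t')\hat{x}(t')\hat{x}(t)\,\mathrm{Tr}_n\left[\hat{w}_e\hat{F}(t')\hat{F}(t)\right]$$
$$= \hat{x}(t)\hat{x}(t')\hat{w}_s(t')\sum_{n,n'}F_{nn'}(t)F_{n'n}(t')W_n - \hat{x}(t)\hat{w}_s(t')\hat{x}(t')\sum_{n,n'}F_{nn'}(t)W_{n'}F_{n'n}(t')$$
$$\quad - \hat{x}(t')\hat{w}_s(t')\hat{x}(t)\sum_{n,n'}F_{nn'}(t')W_{n'}F_{n'n}(t) + \hat{w}_s(t')\hat{x}(t')\hat{x}(t)\sum_{n,n'}W_nF_{nn'}(t')F_{n'n}(t). \tag{7.178}$$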


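Since the summation on both indices $n$ and $n'$ in this expression is over the same energy level set (of all
eigenstates of the environment), we may swap the indices in any of the sums. Doing that in the terms
with factors $W_{n'}$, we turn them into $W_n$, so that this factor becomes common:

$$\mathrm{Tr}_n\left[\ldots\right] = \sum_{n,n'}W_n\left[\hat{x}(t)\hat{x}(t')\hat{w}_s F_{nn'}(t)F_{n'n}(t') - \hat{x}(t)\hat{w}_s\hat{x}(t')F_{n'n}(t)F_{nn'}(t')\right.$$
$$\left.\quad - \hat{x}(t')\hat{w}_s\hat{x}(t)F_{n'n}(t')F_{nn'}(t) + \hat{w}_s\hat{x}(t')\hat{x}(t)F_{nn'}(t')F_{n'n}(t)\right]. \tag{7.179}$$

Now using Eq. (172), we get

$$\mathrm{Tr}_n\left[\ldots\right] = \sum_{n,n'}W_n\left|F_{nn'}\right|^2\left[\hat{x}(t)\hat{x}(t')\hat{w}_s(t')\exp\left\{i\frac{(E_n - E_{n'})(t - t')}{\hbar}\right\} - \hat{x}(t)\hat{w}_s(t')\hat{x}(t')\exp\left\{-i\frac{(E_n - E_{n'})(t - t')}{\hbar}\right\}\right.$$
$$\left.\quad - \hat{x}(t')\hat{w}_s(t')\hat{x}(t)\exp\left\{i\frac{(E_n - E_{n'})(t - t')}{\hbar}\right\} + \hat{w}_s(t')\hat{x}(t')\hat{x}(t)\exp\left\{-i\frac{(E_n - E_{n'})(t - t')}{\hbar}\right\}\right]$$
$$= \sum_{n,n'}W_n\left|F_{nn'}\right|^2\cos\frac{(E_n - E_{n'})(t - t')}{\hbar}\left[\hat{x}(t), \left[\hat{x}(t'), \hat{w}_s(t')\right]\right] + i\sum_{n,n'}W_n\left|F_{nn'}\right|^2\sin\frac{(E_n - E_{n'})(t - t')}{\hbar}\left[\hat{x}(t), \left\{\hat{x}(t'), \hat{w}_s(t')\right\}\right], \tag{7.180}$$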


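where $\{\ldots, \ldots\}$ means the anticommutator - see Eq. (4.34). Comparing the two double sums
participating in this expression with Eqs. (108) and (111), we see that they are nothing else than,
respectively, the symmetrized correlation function and the Green’s function (multiplied by $\hbar/2$) of the
time-difference argument $\tau \equiv t - t' > 0$. As the result, Eq. (177) takes a very simple form:

$$\dot{\hat{w}}_s(t) = -\frac{1}{\hbar^2}\int_{-\infty}^{t}K_F(t - t')\left[\hat{x}(t), \left[\hat{x}(t'), \hat{w}_s(t')\right]\right]dt' + \frac{i}{2\hbar}\int_{-\infty}^{t}G(t - t')\left[\hat{x}(t), \left\{\hat{x}(t'), \hat{w}_s(t')\right\}\right]dt'. \tag{7.181}$$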


Density 

matrix’ 

time 

evolution 


Let me hope that the reader enjoys this beautiful result as much as I do, and that it is a sufficient
intellectual reward for his or her effort of following its derivation. It gives a self-sufficient equation for
the time evolution of the density matrix of the system of our interest (s), with the effects of its environment
represented only by two real algebraic functions of $\tau$ - one ($K_F$) describing the environment’s fluctuations
and another one ($G$) representing its average response to the system’s dynamics. And most
spectacularly, these are exactly the same functions as participate in the Heisenberg-Langevin approach
to the problem, and hence are related to each other by the fluctuation-dissipation theorem (134).

After a short celebration, let us acknowledge that Eq. (181) is still an integro-differential
equation that needs to be solved together with Eq. (169). Such equations do not allow explicit analytical
solutions except for very simple (and not very interesting) cases. For most applications, further
simplifications should be made. One of them is based on the fact (which was already discussed in Sec.
3) that both environmental functions participating in Eq. (181) tend to zero when their argument $\tau$
becomes larger than a certain environment correlation time $\tau_c$, which is frequently much shorter than the
time scales $T_{nn'}$ of the evolution of the density matrix elements. Moreover, the characteristic time scale
of the coordinate operator evolution may be also short on the scale of $T_{nn'}$. In this limit, all arguments $t'$
of the density operator giving substantial contributions to the right-hand part of Eq. (172) are so close to
$t$ that it does not matter whether its argument is $t'$ or just $t$. This simplification ($t' \to t$) is known as the
Markov approximation. 63 However, this approximation alone is still insufficient for finding the general
solution of Eq. (181). Substantial further progress is possible in two important cases.


The most important of them is when the intrinsic Hamiltonian $\hat{H}_s$ of our system of interest is
time-independent and has a discrete eigenenergy spectrum $E_n$, 64 with well-separated levels:

$$\left|E_n - E_{n'}\right| \gg \frac{\hbar}{T_{nn'}}. \tag{7.182}$$

63 Named after A. Markov (1856-1922; in older literature, “Markoff”), because the result of this approximation is
a particular case of the Markov process whose future development is completely determined by its present state.

64 Rather reluctantly, I will use this standard notation, $E_n$, for the eigenenergies of our system of interest (s), in
hope that the reader would not confuse these discrete energy levels with the quasi-continuous energy levels of its
environment, participating in particular in Eqs. (108) and (111). As a reminder, by this stage of our calculations
the environment levels have disappeared, leaving behind their “trace functions” $K_F(\tau)$ and $G(\tau)$.


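Let us see what this condition yields for Eq. (181) rewritten for the matrix elements in the stationary
state basis (from this point on, I will drop index s for brevity):

$$\dot{w}_{nn'} = -\frac{1}{\hbar^2}\int_{-\infty}^{t}K_F(t - t')\left[\hat{x}(t), \left[\hat{x}(t'), \hat{w}\right]\right]_{nn'}dt' + \frac{i}{2\hbar}\int_{-\infty}^{t}G(t - t')\left[\hat{x}(t), \left\{\hat{x}(t'), \hat{w}\right\}\right]_{nn'}dt'; \tag{7.183}$$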


after spelling out the commutators, it includes 4 operator products, which differ “only” by the operator
order. Let us have a good look at the first product,

$$\left(\hat{x}(t)\hat{x}(t')\hat{w}\right)_{nn'} = \sum_{m,m'}x_{nm}(t)x_{mm'}(t')w_{m'n'}, \tag{7.184}$$

where indices $m$ and $m'$ run over the same set of eigenenergies of the system s of our interest as indices
$n$ and $n'$. According to Eq. (169) with a time-independent $\hat{H}_s$, matrix elements $x_{nn'}$ (in the stationary state
basis) oscillate in time as $\exp\{i\omega_{nn'}t\}$, so that

$$\left(\hat{x}(t)\hat{x}(t')\hat{w}\right)_{nn'} = \sum_{m,m'}x_{nm}x_{mm'}\exp\left\{i\left(\omega_{nm}t + \omega_{mm'}t'\right)\right\}w_{m'n'}, \tag{7.185}$$

where the coordinate matrix elements are in the Schrödinger picture now, and I have used the natural
notation (6.85) for the quantum transition frequencies:

$$\hbar\omega_{nn'} \equiv E_n - E_{n'}. \tag{7.186}$$


According to condition (182), frequencies $\omega_{nn'}$ with $n \neq n'$ are much higher than the speed of evolution
of the density matrix elements (in the interaction picture!) - in both the left-hand and right-hand parts of
Eq. (183). As we already know from Sec. 6.5, this means that in the right-hand part of Eq. (183) we may
keep only the terms that do not oscillate with frequencies $\omega_{nn'}$, because the oscillating terms would give negligible
contributions to the density matrix dynamics. 65 For that, in the double sum (185) we may keep only the
terms with exponents proportional to the difference $(t - t')$, because they will give (after integration over $t'$) a slowly
changing contribution to the right-hand part. 66 These terms should have $\omega_{nm} + \omega_{mm'} = 0$, i.e. $(E_n - E_m) +
(E_m - E_{m'}) \equiv E_n - E_{m'} = 0$. For a non-degenerate energy spectrum, this requirement means $m' = n$; as a
result, the double sum is reduced to a single one:

$$\left(\hat{x}(t)\hat{x}(t')\hat{w}\right)_{nn'} \approx \sum_m x_{nm}x_{mn}\exp\left\{i\omega_{nm}(t - t')\right\}w_{nn'} \equiv \sum_m\left|x_{nm}\right|^2\exp\left\{i\omega_{nm}(t - t')\right\}w_{nn'}. \tag{7.187}$$

Another product, $\left(\hat{w}\hat{x}(t')\hat{x}(t)\right)_{nn'}$, that appears in the right-hand part of Eq. (183), may be simplified
absolutely similarly, giving

$$\left(\hat{w}\hat{x}(t')\hat{x}(t)\right)_{nn'} \approx \sum_m\left|x_{n'm}\right|^2\exp\left\{i\omega_{n'm}(t' - t)\right\}w_{nn'}. \tag{7.188}$$


65 This is essentially the same rotating-wave approximation (RWA) that is so instrumental in other fields of not
only quantum mechanics, but classical physics as well - see, e.g., CM Secs. 4.2-4.5.

66 As was already discussed in Sec. 4, the lower-limit substitution ($t' = -\infty$) in integrals (174) gives zero, due to
the finite-time “memory” of the system, expressed by the decay of the correlation and response functions at large
values of the time delay $\tau = t - t'$.
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These expressions hold true whether n and n ’ are equal or not. The situation is different for two 
other products in the right-hand part of Eq. (183), with $\hat{w}$ sandwiched between $\hat{x}(t)$ and $\hat{x}(t')$. For example,

$$\left(\hat{x}(t)\hat{w}\hat{x}(t')\right)_{nn'} = \sum_{m,m'}x_{nm}(t)w_{mm'}x_{m'n'}(t') = \sum_{m,m'}x_{nm}w_{mm'}x_{m'n'}\exp\left\{i\left(\omega_{nm}t + \omega_{m'n'}t'\right)\right\}. \tag{7.189}$$

For this term, the same requirement of having an oscillating function of $(t - t')$ only yields a different
condition: $\omega_{nm} + \omega_{m'n'} = 0$, i.e.

$$\left(E_n - E_m\right) + \left(E_{m'} - E_{n'}\right) = 0. \tag{7.190}$$

Here the double sum reduction is possible only if we make an additional assumption that all interlevel
energy distances are unique, i.e. our system of interest has no equidistant levels (such as in the harmonic
oscillator). For diagonal elements ($n = n'$), the RWA requirement is reduced to $m = m'$, giving sums
over all diagonal elements of the density matrix:

$$\left(\hat{x}(t)\hat{w}\hat{x}(t')\right)_{nn} = \sum_m\left|x_{nm}\right|^2\exp\left\{i\omega_{nm}(t - t')\right\}w_{mm}. \tag{7.191}$$

(Another similar term, $\left(\hat{x}(t')\hat{w}\hat{x}(t)\right)_{nn}$, is just a complex conjugate of Eq. (191).) However, for off-diagonal
matrix elements ($n \neq n'$), the situation is different: Eq. (190) may be satisfied only if $m = n$
and also $m' = n'$, so that the double sum is reduced to just one, non-oscillating term:

$$\left(\hat{x}(t)\hat{w}\hat{x}(t')\right)_{nn'} = x_{nn}w_{nn'}x_{n'n'}, \quad \text{for } n \neq n'. \tag{7.192}$$

The second similar term, $\left(\hat{x}(t')\hat{w}\hat{x}(t)\right)_{nn'}$, is exactly the same, so that in one of the integrals of Eq. (183),
these terms add up, while in the second one, they cancel.


This is why the final equations of evolution look different for diagonal and off-diagonal
elements of the density matrix. For the former case ($n = n'$), Eq. (183) is reduced to the so-called master
equation 67 relating diagonal elements $w_{nn}$ of the density matrix, i.e. the energy level occupancies $W_n$: 68

$$\dot{W}_n = \sum_{m \neq n}\left|x_{nm}\right|^2\int_0^\infty\left[-\frac{1}{\hbar^2}K_F(\tau)\left(W_n - W_m\right)\left(\exp\{i\omega_{nm}\tau\} + \exp\{-i\omega_{nm}\tau\}\right) + \frac{i}{2\hbar}G(\tau)\left(W_n + W_m\right)\left(\exp\{i\omega_{nm}\tau\} - \exp\{-i\omega_{nm}\tau\}\right)\right]d\tau, \tag{7.193}$$


where $\tau \equiv t - t'$. Changing the summation index notation from m to n', we may rewrite the master
equation in its canonical form

$$\dot{W}_n = \sum_{n' \ne n} \left( \Gamma_{n' \to n} W_{n'} - \Gamma_{n \to n'} W_n \right), \tag{7.194}$$

where coefficients

$$\Gamma_{n' \to n} \equiv |x_{nn'}|^2 \int_0^\infty \left[ \frac{2}{\hbar^2} K_F(\tau) \cos\omega_{nn'}\tau - \frac{1}{\hbar} G(\tau) \sin\omega_{nn'}\tau \right] d\tau \tag{7.195}$$


67 The master equations, first introduced to quantum mechanics in 1928 by W. Pauli, are sometimes called the 
“Pauli master equations”, or “kinetic equations”, or “rate equations”. 

68 As Eq. (193) shows, the term with m = n would vanish, and thus may be legitimately excluded from the sum. 
are called the interlevel transition rates. 69 Equation (194) has a very clear physical meaning of the level
occupancy dynamics (i.e. the balance of probability flows ΓW) due to the quantum transitions between
the energy levels (Fig. 6), in our current case caused by the interaction between the system of our
interest and its environment. 70


[Figure: energy-level diagram with the level occupancies $W_n$ and $W_{n'}$, and the higher levels]

Fig. 7.6. Probability flows between the energy levels, described by the master equation (186).


The Fourier transforms (113) and (123) enable us to express two integrals in Eq. (195) via,
respectively, the symmetrized spectral density $S_F(\omega)$ of environment force fluctuations and the
imaginary part $\chi''(\omega)$ of the generalized susceptibility, both at frequency $\omega = \omega_{nn'}$. After that we may use
the fluctuation-dissipation theorem (134) to exclude the former function, getting finally


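$$\Gamma_{n' \to n} = \frac{1}{\hbar} |x_{nn'}|^2 \chi''(\omega_{nn'}) \left( \coth\frac{\hbar\omega_{nn'}}{2k_BT} - 1 \right) \equiv \frac{2}{\hbar} |x_{nn'}|^2 \frac{\chi''(\omega_{nn'})}{\exp\{(E_n - E_{n'})/k_BT\} - 1}. \tag{7.196}$$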


Note that since the imaginary part of the generalized susceptibility is an odd function of 
frequency, Eq. (196) is in compliance with the Gibbs distribution for arbitrary temperature. Indeed, 
according to this equation, the ratio of “up” and “down” rates for each pair of levels equals 


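$$\frac{\Gamma_{n' \to n}}{\Gamma_{n \to n'}} = \frac{\chi''(\omega_{nn'})}{\exp\{(E_n - E_{n'})/k_BT\} - 1} \Bigg/ \frac{\chi''(\omega_{n'n})}{\exp\{(E_{n'} - E_n)/k_BT\} - 1} = \exp\left\{ -\frac{E_n - E_{n'}}{k_BT} \right\}. \tag{7.197}$$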


On the other hand, according to the Gibbs distribution (23), in thermal equilibrium the level populations 
should be in the same proportion, satisfying the so-called detailed balance equations. 


$$W_n \Gamma_{n \to n'} = W_{n'} \Gamma_{n' \to n}, \tag{7.198}$$

for each pair {n, n'}, so that all right-hand parts of all Eqs. (194) could vanish - as they should. Thus,
the stationary solution of the master equations indeed describes the thermal equilibrium.
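
The statement above may be checked numerically. What follows is only a minimal sketch, not a part of the derivation: it assumes the Ohmic susceptibility $\chi''(\omega) = \eta\omega$, units with $\hbar = 1$, and a made-up three-level system (the energies `E`, the matrix `x`, and the function names are all hypothetical). The rates are taken from the second form of Eq. (7.196), and the stationary solution of Eqs. (7.194) is compared with the Gibbs distribution.

```python
import numpy as np


def ohmic_rates(E, x, eta, kT, hbar=1.0):
    """Return gamma[a, b], the rate of transitions a -> b, from Eq. (7.196)
    with the (assumed) Ohmic susceptibility chi''(omega) = eta * omega."""
    E = np.asarray(E, dtype=float)
    n_levels = len(E)
    gamma = np.zeros((n_levels, n_levels))
    for a in range(n_levels):          # initial level (n' in the text)
        for b in range(n_levels):      # final level (n in the text)
            if a == b:
                continue
            dE = E[b] - E[a]           # E_n - E_n', assumed to be nonzero
            chi2 = eta * dE / hbar     # chi''(omega_nn')
            gamma[a, b] = (2.0 / hbar) * abs(x[b, a]) ** 2 * chi2 / np.expm1(dE / kT)
    return gamma


def master_rhs(W, gamma):
    """Right-hand side of Eq. (7.194): inflow minus outflow for each level."""
    return gamma.T @ W - gamma.sum(axis=1) * W


def stationary_occupancies(gamma):
    """Solve master_rhs(W) = 0 together with the normalization sum(W) = 1."""
    n_levels = gamma.shape[0]
    A = gamma.T - np.diag(gamma.sum(axis=1))
    A[-1, :] = 1.0                     # replace one redundant equation by sum(W) = 1
    b = np.zeros(n_levels)
    b[-1] = 1.0
    return np.linalg.solve(A, b)


# Illustrative (made-up) three-level system with non-equidistant levels.
E = np.array([0.0, 1.0, 2.7])
x = np.array([[0.0, 1.0, 0.3],
              [1.0, 0.0, 0.8],
              [0.3, 0.8, 0.0]])
kT = 0.8
gamma = ohmic_rates(E, x, eta=0.1, kT=kT)
W = stationary_occupancies(gamma)
gibbs = np.exp(-E / kT) / np.exp(-E / kT).sum()
print(W)
print(gibbs)
```

The two printed arrays should coincide within the rounding errors, because the rates (7.196) satisfy the detailed balance pair by pair; the coupling strength `eta` and the matrix elements only set the time scale of the relaxation, not its end point.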


The closed system of master equations (194), sometimes complemented by additional right-hand-part
terms that describe interlevel transitions due to other factors (e.g., by an external ac force with
a frequency close to one of $\omega_{nn'}$), is the key starting point for practical analyses of many quantum
systems including quantum generators (masers and lasers). It is important to remember that it is strictly
valid only in the rotating-wave approximation, i.e. if Eq. (182) is well satisfied for all n and n'.


69 As Eq. (193) shows, the result for $\Gamma_{n \to n'}$ is described by Eq. (195) as well, provided that indices n and n' are
swapped in all components of its right-hand part, including the swap $\omega_{nn'} \to \omega_{n'n} = -\omega_{nn'}$.

70 It is straightforward to show that at relatively low temperatures ($k_BT \ll |E_{n'} - E_n|$), Eq. (196) gives the same
result as the Golden Rule formula (6.134) - see Exercise 2. (The low temperature limit is necessary to ensure that
the initial occupancy of the excited level is negligible, as was assumed at the derivation of Eq. (6.134).)


For a particular (but very important) case of a two-level system (say, with $E_1 > E_2$) in the low-temperature
limit $k_BT \ll \hbar\omega_{12} \equiv E_1 - E_2$, rate $\Gamma_{1 \to 2} \gg \Gamma_{2 \to 1}$ defines the characteristic time $T_1 \equiv 1/\Gamma_{1 \to 2}$
of the energy relaxation process that brings the diagonal elements of the density matrix to their
thermally-equilibrium values (23). For the Ohmic dissipation described by Eq. (138), Eq. (196) yields a
simple expression

$$\frac{1}{T_1} \equiv \Gamma_{1 \to 2} = \frac{2}{\hbar} |x_{12}|^2 \chi''(\omega_{12}) = \frac{2}{\hbar^2} |x_{12}|^2\, \eta\, (E_1 - E_2). \tag{7.199}$$


Of course, time $T_1$ should not be confused with the characteristic time $T_2$ of relaxation of the off-diagonal
elements, i.e. dephasing, which was already discussed in Sec. 3. By the way, let us see what
Eq. (183) says about the dephasing rate. Taking into account our intermediate results (187)-(192), and
merging the non-oscillating components (with m = n and m = n') of sums (187) and (188) with the
terms (192), which also do not oscillate in time, we get the following equation: 71


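$$\dot{w}_{nn'} = -\left\{ \int_0^\infty \left[ \frac{1}{\hbar^2} K_F(\tau) \left( \sum_{m \ne n} |x_{nm}|^2 \exp\{i\omega_{nm}\tau\} + \sum_{m \ne n'} |x_{n'm}|^2 \exp\{-i\omega_{n'm}\tau\} + (x_{nn} - x_{n'n'})^2 \right) - \frac{i}{2\hbar} G(\tau) \left( \sum_{m \ne n} |x_{nm}|^2 \exp\{i\omega_{nm}\tau\} - \sum_{m \ne n'} |x_{n'm}|^2 \exp\{-i\omega_{n'm}\tau\} \right) \right] d\tau \right\} w_{nn'}, \quad \text{for } n \ne n'. \tag{7.200}$$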


In contrast with Eq. (194), the right-hand part of this equation includes both a real and an imaginary 
part, and hence it may be presented as 


$$\dot{w}_{nn'} = -\left( 1/T_{nn'} + i\Delta_{nn'} \right) w_{nn'}, \tag{7.201}$$

where both factors $1/T_{nn'}$ and $\Delta_{nn'}$ are real. 72 As should be clear from Eq. (201), the second term in the
right-hand part of this equation causes slow oscillations of the matrix elements $w_{nn'}$, which, after returning
to the Schrodinger picture, add just small corrections 73 to the unperturbed frequencies (183) of their
oscillations, and are hence not important for most applications. More important is the first term,


71 Because of the reason explained above, this (relatively :-) simple result is not valid for systems with equidistant
energy spectra, most importantly, for the harmonic oscillator (while Eq. (7.194) is). For the oscillator, with its
simple matrix elements $x_{nn'}$, it is straightforward to repeat the above calculations, starting from (7.183), to obtain
an equation similar to Eq. (7.200), but with two other terms, proportional to $w_{n\pm1,n'\pm1}$, in its right-hand part. Since
for the harmonic oscillator the Heisenberg-Langevin approach allows obtaining most results in a much simpler
way, I will skip the derivation of this equation and the discussion of its solutions. The interested reader may find
such a discussion, for example, in a paper by B. Zeldovich et al., Sov. Phys. JETP 28, 308 (1969).

72 Sometimes Eq. (200) (in any of its numerous alternative forms) is called the Redfield equation, after the 1965
work by A. Redfield. Note, however, that several other authors, notably including (in the alphabetical order) H.
Haken, W. Lamb, M. Lax, W. Louisell, and M. Scully, also made key contributions to the very fast
development of the density-matrix approach to open quantum systems in the mid-1960s.

73 This correction is frequently called the Lamb shift, because it was first observed experimentally in 1947 by W.
Lamb and R. Retherford, as a minor, ~1 GHz shift between energy levels of 2s and 2p states of hydrogen, due to
the electric-dipole coupling of hydrogen atoms to the free-space electromagnetic environment. (These levels are
equal not only in the nonrelativistic theory (Sec. 3.6), but also in the relativistic, Dirac theory (Sec. 9.7), if the
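$$\frac{1}{T_{nn'}} = \int_0^\infty \left[ \frac{1}{\hbar^2} K_F(\tau) \left( \sum_{m \ne n} |x_{nm}|^2 \cos\omega_{nm}\tau + \sum_{m \ne n'} |x_{n'm}|^2 \cos\omega_{n'm}\tau + (x_{nn} - x_{n'n'})^2 \right) + \frac{1}{2\hbar} G(\tau) \left( \sum_{m \ne n} |x_{nm}|^2 \sin\omega_{nm}\tau + \sum_{m \ne n'} |x_{n'm}|^2 \sin\omega_{n'm}\tau \right) \right] d\tau, \tag{7.202}$$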

because it describes the effect absent without the environment: an exponential decay of the off-diagonal 
matrix elements, i.e. dephasing. Comparing the first 2 terms of Eq. (202) with Eq. (195), we see that the 
dephasing rates may be described by a very simple formula: 


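$$\frac{1}{T_{nn'}} = \frac{1}{2} \left( \sum_{m \ne n} \Gamma_{n \to m} + \sum_{m \ne n'} \Gamma_{n' \to m} \right) + \frac{\pi}{\hbar^2} (x_{nn} - x_{n'n'})^2 S_F(0) = \frac{1}{2} \left( \sum_{m \ne n} \Gamma_{n \to m} + \sum_{m \ne n'} \Gamma_{n' \to m} \right) + \frac{k_BT}{\hbar^2}\, \eta\, (x_{nn} - x_{n'n'})^2, \quad \text{for } n \ne n', \tag{7.203}$$

where the low-frequency viscosity coefficient η is again defined as $\lim_{\omega \to 0} \chi''(\omega)/\omega$ - see Eq. (138).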


This result shows that two effects yield independent contributions to dephasing. The first of
them may be interpreted as a result of the “virtual” transitions of the system to other energy levels m;
according to Eq. (187), it is proportional to the strength of coupling to environment at relatively high
frequencies $\omega_{nm}$ and $\omega_{n'm}$. (If the energy quanta $\hbar\omega$ of these frequencies are much larger than the thermal
fluctuation scale $k_BT$, only the lower levels, with $E_m < \max[E_n, E_{n'}]$, are important.) On the contrary, the
second contribution is due to low-frequency, essentially classical fluctuations of the environment, and
hence to the low-frequency dissipative susceptibility. If the susceptibility (more exactly, the ratio
$\eta = \chi''(\omega)/\omega$) is frequency-independent, both contributions are of the same order, but their exact relation
depends on the relation between the matrix elements $x_{nn'}$ of a particular system.
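
To make the structure of the second form of Eq. (7.203) more tangible, here is a short sketch of how the dephasing rate could be assembled from a table of transition rates. The function name, the array convention `gamma[n, m]` for the rate of transitions n → m, and the numbers in the example are all assumptions made for this illustration.

```python
import numpy as np


def dephasing_rate(gamma, x_diag, n, n_prime, eta, kT, hbar=1.0):
    """1/T_nn' from the second form of Eq. (7.203).

    gamma[a, b] is the (assumed) rate of transitions a -> b,
    x_diag[a] is the diagonal matrix element x_aa.
    """
    if n == n_prime:
        raise ValueError("Eq. (7.203) is valid only for n != n'")
    gamma = np.asarray(gamma, dtype=float)
    # Half of the total rate of leaving each of the two levels ("virtual" transitions).
    out_n = gamma[n].sum() - gamma[n, n]
    out_n_prime = gamma[n_prime].sum() - gamma[n_prime, n_prime]
    transition_part = 0.5 * (out_n + out_n_prime)
    # Low-frequency, essentially classical contribution.
    low_frequency_part = kT * eta * (x_diag[n] - x_diag[n_prime]) ** 2 / hbar ** 2
    return transition_part + low_frequency_part


# Two-level example without interlevel transitions, with x_11 = +1 and x_22 = -1
# (hbar = 1): only the low-frequency term survives.
rate = dephasing_rate(np.zeros((2, 2)), [1.0, -1.0], 0, 1, eta=0.5, kT=2.0)
print(rate)
```

In this example the transition part vanishes, so that the whole rate comes from the term proportional to $(x_{nn} - x_{n'n'})^2$; with nonzero rates in `gamma`, the two contributions simply add up.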


Returning again to the two-level system discussed in Sec. 3, the high-frequency contributions 
vanish because of the absence of transitions between its energy levels, while the low-frequency 
contribution yields 
$$\frac{1}{T_2} = \frac{k_BT}{\hbar^2}\, \eta\, (x_{11} - x_{22})^2 = \frac{4k_BT}{\hbar^2}\, \eta, \tag{7.204}$$

thus exactly reproducing the result (142) of the Heisenberg-Langevin approach. 74 Note also that Eq.
(204) for $T_2$ is very close in structure to Eq. (199) for $T_1$. For our simple interaction model (70), the off-diagonal
elements of operator $\hat{x} = \hat{\sigma}_z$ in the stationary-state z-basis vanish, so that $T_1 \to \infty$. For the two-well
implementation of the model (see Fig. 4 and its discussion), this result corresponds to a very high
energy barrier between the wells, that inhibits tunneling, and hence any change of well occupancies $W_L$


electromagnetic environment is ignored.) The explanation of the shift, by H. Bethe in the same year, 1947,
launched the whole field of quantum electrodynamics - to be briefly discussed in Chapter 9.

74 The first form of Eq. (203), as well as the analysis of Sec. 3, imply that low-frequency fluctuations of any other
origin, not taken into account in our current calculations (say, unintentional noise from experimental equipment),
may also cause dephasing; such “technical fluctuations” are indeed a serious challenge for the experimental
implementation of coherent qubit systems - see Sec. 8.5 below.
and $W_R$. However, $T_1$ may become finite, and comparable with $T_2$, if tunneling between the wells is
substantial. 75


Now let us briefly discuss dissipative systems with continuous spectrum. Unfortunately, for them
the only (relatively :-) simple results that may be obtained from Eq. (181) are essentially classical in
nature. As an illustration, let us consider the simplest example of a 1D particle that interacts with a
thermally-equilibrium environment, but otherwise is free to move (unconfined). As we know from
Chapters 2 and 5, in this case the most convenient basis is that of momentum eigenstates p. In the
momentum representation, the density matrix is just the c-number function w(p, p'), defined by Eq. (54),
that has already been discussed in brief in Sec. 2. On the other hand, the coordinate operator, that also
participates in the right-hand part of Eq. (181), has the form given by the first of Eqs. (5.64),

$$\hat{x} = i\hbar \frac{\partial}{\partial p}, \tag{7.205}$$


dual to the coordinate representation formula (5.29). As we already know, such operators are local - see, 
e.g., Eq. (5.28b). Due to this locality, the whole right-hand part of Eq. (181) is local as well, and hence 
(within the framework of our perturbative treatment) the interaction with environment affects essentially 
only the diagonal values w(p, p) of the density matrix, i.e. the momentum probability density w(p). Let 
us find the equation governing the evolution of this function in time. 


Generally, in the interaction picture, matrix elements of operators x and w acquire some time
dependence, but in the limit p' → p, this dynamics lacks the high frequencies (186) that have been so
helpful for the derivation of master equations. As a result, the only serious simplification of Eq. (181) is
possible in the Markov approximation, when the time scale of the density matrix evolution is much
longer than the correlation time $\tau_c$ of the environment, i.e. the time scale of functions $K_F(\tau)$ and $G(\tau)$. In
this approximation, we may take the matrix elements out of the first integral of Eq. (181),

$$-\frac{1}{\hbar^2} \int_{-\infty}^{t} K_F(t - t')\, dt'\, [\hat{x}(t), [\hat{x}(t'), \hat{w}(t')]] \approx -\frac{1}{\hbar^2} \int_0^\infty K_F(\tau)\, d\tau\, [\hat{x}, [\hat{x}, \hat{w}]] = -\frac{\pi S_F(0)}{\hbar^2} [\hat{x}, [\hat{x}, \hat{w}]] = -\frac{k_BT}{\hbar^2}\, \eta\, [\hat{x}, [\hat{x}, \hat{w}]], \tag{7.206}$$


and calculate the double commutator in the Schrodinger picture. This may be done either using an
explicit expression for the matrix elements of the coordinate operator, dual to Eq. (5.28b), or in a
simpler way, using the same trick as at the derivation of the Ehrenfest theorem in Sec. 5.2. Namely,
expanding an arbitrary function f(p) into the Taylor series in one of its arguments (say, p),

$$f(p) = \sum_{k=0}^{\infty} \frac{1}{k!} \frac{\partial^k f}{\partial p^k} \bigg|_{p=0} p^k, \tag{7.207}$$

and applying Eq. (205) to each term, we can prove the following simple commutation relation:


75 The tunneling may be described without altering Eq. (70), just by adding, to the unperturbed Hamiltonian (69), 
terms proportional to other Pauli matrices. The reader is encouraged to spell out the equations for the time 
evolution of the density matrix elements of this system, and analyze their main properties - at least in the low- 
temperature limit. 
$$[\hat{x}, f] = i\hbar \frac{\partial f}{\partial p}. \tag{7.208}$$


Now applying this result sequentially, first to w(p,p’) and then to the resulting commutator, we get 

$$[\hat{x}, [\hat{x}, w]] = -\hbar^2 \frac{\partial^2 w}{\partial p^2}. \tag{7.209}$$


It may look like the second integral in Eq. (181) might be simplified similarly. However, it
vanishes at p' → p and t' → t, so that in order to calculate the first nonvanishing contribution from that
integral for p' = p, we have to take into account the small difference $\tau \equiv t - t' \sim \tau_c$ between the
arguments of the coordinate operators under that integral. This may be done using Eq. (169) with the
free-particle Hamiltonian consisting of the kinetic-energy contribution alone:

$$\hat{x}(t') - \hat{x}(t) \approx -\tau \frac{d\hat{x}}{dt} = -\tau \frac{1}{i\hbar} \left[ \hat{x}, \frac{\hat{p}^2}{2m} \right] = -\tau \frac{\hat{p}}{m}, \tag{7.210}$$


where the exact argument of the operator in the right-hand part is already unimportant, and may be taken 
for t. As a result, we may use the last of Eqs. (136) to reduce the second term in the right-hand part of 
Eq. (181) to 


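$$\frac{i}{2\hbar} \int_{-\infty}^{t} G(t - t')\, [\hat{x}(t), \{\hat{x}(t'), \hat{w}(t')\}]\, dt' \approx -\frac{i}{2\hbar} \int_0^\infty G(\tau)\, \tau\, d\tau \left[ \hat{x}, \left\{ \frac{\hat{p}}{m}, \hat{w} \right\} \right] = \frac{\eta}{2i\hbar} \left[ \hat{x}, \left\{ \frac{\hat{p}}{m}, \hat{w} \right\} \right]. \tag{7.211}$$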


In the momentum representation, the momentum operator and the density matrix w are just c-numbers 
and commute, so that, applying Eq. (208) to product pw, we get 
$$[\hat{x}, \{\hat{p}, \hat{w}\}] = [\hat{x}, 2pw] = 2i\hbar \frac{\partial}{\partial p}(pw), \tag{7.212}$$


and may finally reduce the integro-differential Eq. (181) to a partial differential equation:

$$\frac{\partial w}{\partial t} = \frac{\eta}{m} \frac{\partial}{\partial p}(pw) + \eta k_BT \frac{\partial^2 w}{\partial p^2}. \tag{7.213}$$

This is the 1D form of the famous Fokker-Planck equation describing the classical statistics of
motion of a free 1D particle in a medium with linear viscosity η. The first, drift term in the right-hand
part of Eq. (213) describes the particle's deceleration due to the average viscous force (137), ⟨F⟩ = -ηv = -ηp/m,
provided by the environment, while the second, diffusion term describes the effect of fluctuations:
the particle's random walk that obeys Eq. (85) with the diffusion coefficient

$$D = \eta k_BT. \tag{7.214}$$

This fundamental Einstein relation 76 shows again the intimate connection between the dissipation
(viscosity) and fluctuations, in this classical limit represented by their thermal energy scale $k_BT$. 77
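
The balance between the drift and diffusion terms may be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the 1D Fokker-Planck equation in the form $\partial w/\partial t = (\eta/m)\,\partial(pw)/\partial p + \eta k_BT\,\partial^2 w/\partial p^2$, uses arbitrary units with $\eta = m = k_BT = 1$, and evaluates the right-hand part on a momentum grid with central finite differences (the function names and the grid are hypothetical choices).

```python
import numpy as np


def fokker_planck_rhs(w, p, eta, m, kT):
    """Right-hand part of the 1D Fokker-Planck equation on a uniform
    momentum grid, evaluated with central finite differences."""
    dp = p[1] - p[0]
    drift = (eta / m) * np.gradient(p * w, dp)
    diffusion = eta * kT * np.gradient(np.gradient(w, dp), dp)
    return drift + diffusion


def gaussian(p, mean, variance):
    """Normalized Gaussian probability density of momentum."""
    return np.exp(-(p - mean) ** 2 / (2.0 * variance)) / np.sqrt(2.0 * np.pi * variance)


eta, m, kT = 1.0, 1.0, 1.0
p = np.linspace(-10.0, 10.0, 2001)
dp = p[1] - p[0]

# The Maxwell distribution (variance m*kT) should be stationary.
maxwell = gaussian(p, 0.0, m * kT)
residual = np.max(np.abs(fokker_planck_rhs(maxwell, p, eta, m, kT)))

# For a distribution with the average momentum p0 = 1, the drift term
# should give d<p>/dt = -(eta/m) * p0.
shifted = gaussian(p, 1.0, m * kT)
dp_mean_dt = np.sum(p * fokker_planck_rhs(shifted, p, eta, m, kT)) * dp

print(residual, dp_mean_dt)
```

The first number is small (it is limited only by the finite-difference error), showing that the Maxwell distribution with ⟨p²⟩ = m$k_BT$ makes the drift and diffusion terms cancel exactly when $D = \eta k_BT$; the second number is close to -η/m, i.e. to the deceleration by the average viscous force.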


76 It was the main result of A. Einstein's pioneering analysis of such Brownian motion in 1905. (The development
of this analysis in 1906-1908 by M. Smoluchowski has led in 1912 to the Fokker-Planck theory.)
Just for reader's reference, let me note that the Fokker-Planck equation (213) may be readily
generalized to the 3D motion of a particle under the effect of an additional external force $\mathbf{F}_{ext}(\mathbf{r}, t)$: 78

$$\frac{\partial w}{\partial t} + \frac{\mathbf{p}}{m} \cdot \nabla w + \mathbf{F}_{ext} \cdot \nabla_p w = \frac{\eta}{m} \nabla_p \cdot (\mathbf{p} w) + \eta k_BT\, \nabla_p^2 w, \tag{7.215}$$

where w = w(r, p, t) is the time-dependent probability density in the 6D phase space, and $\nabla_p$ is the
nabla/del operator of differentiation over the momentum components, defined similarly to its coordinate
counterpart ∇. The Fokker-Planck equation in this form is the basis for many important applications;
however, due to its classical character, its discussion is left for the SM part of my lecture notes. 79


To summarize our discussion of the two alternative approaches to the analysis of quantum 
systems interacting with a thermally-equilibrium environment, described in the last three sections, let 
me emphasize that they give descriptions of the same phenomena, and are characterized by the same two 
functions $G(\tau)$ and $K_F(\tau)$, but from two different points of view. Namely, in the Heisenberg-Langevin
approach we describe the system by operators that change (fluctuate) in time, even in thermal 
equilibrium, while in the density-matrix approach the system is described by non-fluctuating probability 
functions, such as W n (t) or w(p), that are stationary in equilibrium. In the (relatively rare) cases when a 
problem may be solved by either method, they give identical results for all observables. 


7.7. Quantum measurements 

Now we have got a sufficient quantum mechanics background for a brief discussion of quantum
measurements. 80 Let me start by reminding the reader of the only postulate of quantum mechanics that
relates this theory with experiment. In Chapter 4 it was formulated for a pure state described with ket-vector

$$|\alpha\rangle = \sum_j \alpha_j |a_j\rangle, \tag{7.216}$$


77 This classical relation may be derived in several other ways - including those much simpler than used
above. For example, since the Brownian particle's motion may be described by a linear Langevin equation, Eq.
(214) may be readily obtained from the Nyquist formula (139) - see, e.g., SM Sec. 5.5.

78 Moreover, Eq. (213) may be generalized to the motion in an additional periodic potential U(r). In this case, an
analog of Eq. (215) for the probability density of quasi-momentum ħq (rather than the genuine momentum p)
includes an additional energy band index (say, n), an additional force $\mathbf{F}_n = -\nabla E_n$ (where $E_n(\mathbf{q})$ is the energy band
structure that was discussed in Secs. 2.7 and 3.4), and an additional term similar to the right-hand part of Eq.
(194), describing interband transitions with quasi-momentum-dependent rates $\Gamma_{n \to n'}(\mathbf{q})$. These rates are still
expressed by Eq. (196), but with the matrix elements $x_{nn'}$ replaced by those of the vector operator Q = r - i∇ of
interband transitions, which was discussed in Chapter 5. For details and a particular example of a sinusoidal
potential see, e.g., K. Likharev and A. Zorin, J. Low Temp. Phys. 59, 347 (1985).

79 For a more detailed analysis and several examples of quantum effects in dissipative systems with continuous 
spectra see, e. g., U. Weiss, Quantum Dissipative Systems, 2 nd ed., World Scientific, 1999, or H.-P. Breuer and F. 
Petruccione, The Theory of Open Quantum Systems, Oxford U. Press, 2007. 

80 “Quantum measurements” is a very unfortunate term; it would be more sensible to speak about “measurements
of quantum mechanical observables”. However, the former term is so common and compact that I will use it.
where $a_j$ and $A_j$ are, respectively, the eigenstates and eigenvalues of the operator of observable A, defined by Eq. (4.68).
According to the postulate, the outcome of each particular measurement of observable A may be
uncertain, 81 but is restricted to the set of eigenvalues $A_j$, with the probability of outcome $A_j$ equal to

$$W_j = |\alpha_j|^2. \tag{7.217}$$

Since we know now that the state of the system (or rather of the statistical ensemble of similar systems
we are using for measurements) is generally not pure, this postulate should be re-worded as follows:
even if the system is in the least uncertain state (216), the measurement outcomes are still probabilistic,
and obey Eq. (133). 82

Quantum measurement may be understood as a procedure of transferring the “microscopic”
information contained in coefficients $\alpha_j$ into “macroscopically” available information about the
outcomes of particular experiments, that may be recorded and reliably stored - say, on paper, or in a
computer, or in our minds. If we believe that such transfer may be always done well enough, and do not 
worry too much how exactly, we are subscribing to the mathematical notion of measurement, that was 
(rather reluctantly) used in these notes - up to this point. However, every physicist should understand 
that measurements are performed by physical devices that also should obey the laws of quantum 
mechanics, and it is important to understand the basic laws of their operation. 

The founding fathers of quantum mechanics have not paid much attention to these issues, 
probably because of the following two reasons. First, at that time it looked like the experimental 
instruments (at least the best of them :-) were doing exactly what postulate (217) was telling. For 
example, had not the z-oriented Stern-Gerlach experiment turned two complex coefficients $\alpha_\uparrow$ and $\alpha_\downarrow$,
describing the incoming electron beam, into particle counter clicks with rates proportional to,
respectively, $|\alpha_\uparrow|^2$ and $|\alpha_\downarrow|^2$? Also, the crude internal nature of these instruments made more detailed
questions unnatural. For example, the electron rate counting with a Geiger counter involves an effective 
disappearance of each incoming electron inside a zillion-particle electric discharge avalanche. Thinking 
about such devices, it was hard to even imagine measurements that would not disturb the quantum state 
of the particle being measured. 

However, since that time the experimental techniques, notably including high vacuum, low 
temperatures, and low-noise electronics, have much improved, and eventually more inquisitive 
questions started to look not so hopeless. In my scheme of things, these questions may be grouped as 
follows: 

(i) What are the main laws of a quantum measurement as a physical process? In particular, 
should it always involve time irreversibility? a human/intelligent observer? (The last question is not as 
laughable as it may look - see below.) 

(ii) What is the state of the measured system just after a single-shot measurement - meaning the
measurement process limited to a time interval much shorter than the time scale of the measured system's
evolution? This question is naturally related to the issues of repeated measurements and continuous
monitoring of the system's state.


81 Besides the trivial case $\alpha_j = \delta_{jj'}$ (so that $W_j = \delta_{jj'}$), when the system is in a certain eigenstate ($a_{j'}$) of operator A.

82 The reader in doubt is invited to compare entropy $S = -\sum_j W_j \ln W_j$, the measure of system's disorder (see, e.g.,
SM Sec. 2.2) of the pure state (S = 0) with that in any state with several nonvanishing values of $W_j$ (S > 0).
(iii) If a measurement of observable A produced a certain outcome $A_j$, can we believe that the
system had been in the corresponding state $a_j$ just before the measurement?

The last question is most closely related to various interpretations of quantum mechanics, and 
will be discussed in the concluding Chapter 10, and now let me provide some input on the first two 
groups of issues. 

First of all, I am happy to report that there is a virtual consensus of physicists on the two first
questions of series (i). According to this consensus, any quantum measurement needs to result in a
certain, distinguishable state of a macroscopic output component of the measurement instrument - see
Fig. 7. (Traditionally, this component is called a pointer, though its role may be played by a printer or a
plotter, an electronic circuit sending out the result as a number, etc.).

This requirement implies that the measurement process should have the following features: 

- be time-irreversible , 

- provide large “signal gain”, i.e. mapping the quantum process with its ħ-scale of action (i.e. of
the energy-by-time product) onto a macroscopic motion of the pointer with a much larger action scale, 
and 


- if we want high measurement fidelity, the process should introduce as little additional
uncertainty as permitted by the laws of physics.



[Figure: block diagram with the quantum system, the macroscopic pointer, the necessary interaction and the back action between them, and the output to a human observer]

Fig. 7.7. General scheme of quantum measurement.


All these requirements are fulfilled in a good Stern-Gerlach experiment. However, since the
internal physics of the particle detector at this measurement is rather complex, let me give an example of
a different, simpler single-shot scheme 83 capable of measuring the instant state of a typical two-level
system, for example, a particle in a double quantum well potential (Fig. 8). 84 Let the system be, at t
= 0, in a pure quantum state described by ket-vector

$$|\alpha\rangle = \alpha_\rightarrow |\rightarrow\rangle + \alpha_\leftarrow |\leftarrow\rangle, \tag{7.218}$$


83 This scheme may be implemented, for example, using a simple Josephson-junction circuit called the balanced
comparator - see, e.g., T. Walls et al., IEEE Trans. on Appl. Supercond. 17, 136 (2007), and references therein.
Experiments by V. Semenov et al., IEEE Trans. Appl. Supercond. 7, 3617 (1997) have demonstrated that this
system may have measurement accuracy dominated by quantum-mechanical uncertainty at relatively modest
cooling (to ~1 K). One of the advantages of such implementation of this measurement scheme is that it is based on
externally-shunted Josephson junctions - devices whose quantum-mechanical model is in a quantitative
agreement with experiment - see, e.g., D. Schwartz et al., Phys. Rev. Lett. 55, 1547 (1985). Colloquially, the
balanced comparator is an instrument with a “well-documented Hamiltonian” including its part describing
coupling to environment.

84 As a reminder, dynamics of this system was discussed in Sec. 2.6 and then again in Sec. 6.1. 
where the component states → and ← may be described by wavefunctions localized near the potential
well bottoms at $x_s \approx \pm x_0$ - see the blue lines in Fig. 8b. Let us rapidly change the potential profile of the
system at t = 0, so that at t > 0, and near the origin, it may be well approximated by an inverted parabola
(see the red line in Fig. 8b):

$$U(x_s) \approx -\frac{m\lambda^2}{2} x_s^2, \quad \text{at } t > 0,\ |x_s| \ll x_f. \tag{7.219}$$



Fig. 7.8. Potential inversion on (a) “macroscopic” and (b) “microscopic” scales of coordinate x. 

It is straightforward to verify that the Heisenberg equations of motion in such inverted potential
describe an exponential growth of operator $\hat{x}_s$ in time (proportional to exp{λt}), and hence a similar
growth of the expectation value $\langle x_s \rangle$ and its r.m.s. uncertainty $\delta x_s$. 85 At this “inflation” stage, the
coherence between the two component states → and ← is still preserved, i.e. the time evolution is
reversible.
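
The verification mentioned above may be sketched as follows. For the inverted parabola $U = -m\lambda^2 x_s^2/2$ (this explicit form of the coefficient is my assumption here), the Heisenberg equations $d\hat{x}_s/dt = \hat{p}/m$ and $d\hat{p}/dt = m\lambda^2 \hat{x}_s$ have the solution $\hat{x}_s(t) = \hat{x}_s(0)\cosh\lambda t + [\hat{p}(0)/m\lambda]\sinh\lambda t$. The short script below evaluates the resulting average and r.m.s. spread, assuming additionally that the initial average momentum vanishes and that the initial coordinate and momentum are uncorrelated; the function name and all numbers are purely illustrative.

```python
import numpy as np


def inflation(t, lam, m, x_mean0, dx0, dp0):
    """Average and r.m.s. spread of x_s at time t for U = -m*lam**2*x**2/2.

    Assumes <p(0)> = 0 and no initial coordinate-momentum correlation.
    """
    c, s = np.cosh(lam * t), np.sinh(lam * t)
    x_mean = x_mean0 * c
    dx = np.sqrt((dx0 * c) ** 2 + (dp0 * s / (m * lam)) ** 2)
    return x_mean, dx


lam, m = 1.0, 1.0
for t in (0.0, 1.0, 5.0, 20.0):
    x_mean, dx = inflation(t, lam, m, x_mean0=1.0, dx0=0.2, dp0=0.5)
    print(t, x_mean, dx, dx / x_mean)
```

Both the average and the spread grow as exp{λt} at λt >> 1, while their ratio saturates at a constant value, so that the exponential “inflation” by itself does not degrade the relative uncertainty at this stage.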

Now let the system be weakly coupled to a dissipative (e.g., Ohmic) environment. As we already
know, the environment performs two functions. First, it provides motion with viscosity η (141), so that
the system would eventually come to rest at one of the relatively distant minima, $\pm x_f$, of the inverted
potential (Fig. 8a). Second, the dissipative environment ensures state's dephasing on some time scale $T_2$.
If we select the measurement system parameters in such a way that

$$x_0 \ll x_0 \exp\{\lambda T_2\} \ll x_f, \tag{7.220}$$

then the process, after the potential inversion, consists of the following stages, well separated in time: 

- the “inflation” stage, preserving the component state coherence but providing an exponential 
increase of its energy, 


- the dephasing stage, at which the coherence is suppressed, and the density matrix of the system
is reduced to a diagonal form describing the classical mixture of the probability packets propagating to
the left and to the right, and

- the stage of settling to a new stationary state - a classical mixture of two states located near
points $x_s = \pm x_f$, with probabilities (217) equal to, respectively, $W_\rightarrow = |\alpha_\rightarrow|^2$ and $W_\leftarrow = |\alpha_\leftarrow|^2 = 1 - |\alpha_\rightarrow|^2$.


85 Somewhat counter-intuitively, the latter growth plays a positive role for measurement fidelity. Indeed, it does
not affect the intrinsic “signal-to-noise ratio” $\delta x_s/\langle x_s \rangle$, while making the intrinsic (say, quantum-mechanical)
uncertainty much larger than the possible noise contribution by the later measurement stage(s).

If the final states are macroscopically distinguishable (i.e. may play the role of a bistable 
pointer), as they are in the balanced-comparator implementation, there is absolutely no need, at any of 
these stages, to involve any mysterious “another mechanism of wavefunction change” (different from 
the regular, Schrodinger evolution) for the measurement process description. 

This may be the only appropriate time to mention, very briefly, the famous - or rather infamous -
Schrodinger cat paradox so much overplayed in popular press. (The only good aspect of this popularity
is that the formulation of this paradox is certainly so well known to the reader, that I do not need to 
repeat it.) In this thought experiment, there is no need to discuss the (rather complex :-) physics of the 
cat. As soon as the charged particle, produced at the radioactive decay, reaches the Geiger counter, the 
process rapidly becomes irreversible, so that the coherent state of the system is reduced to a classical
mixture of two possible states: “decay” - “no decay”, leading, correspondingly, to the “cat dead” - “cat
alive” states. So, despite attempts by numerous authors, typically without proper physics background, to
present this situation as a mystery whose discussion needs the involvement of professional philosophers,
hopefully by this point the reader knows enough about dephasing not to pay any attention to them. Let me, however,
note the two non-trivial features of this gedanken experiment, that are met in most real experiments as 
well, including that with the potential inversion (Fig. 8). 

First, the role of the measured coordinate of the system under observation (s) may be played not
by a coordinate of a single fundamental particle, but a certain combination of coordinates of many 
microscopic components of a macroscopic body. In particular, in Josephson junction systems such as the 
balanced comparator we essentially measure the persistent electric current (“supercurrent”) - a certain 
linear combination of Cartesian components of the momenta of the electrons that constitute the Bose- 
Einstein condensate of Cooper pairs. At that, the role of the local environment (that contributes 
significantly to dissipative phenomena) is played by the same electrons, with other linear combinations 
of electron momenta playing the role of environmental degrees of freedom - which were called {λ} in
the last few sections. This makes the coupling to environment somewhat less apparent (at least for the 
people who do not know what a linear combination is :-). 

Second, one may argue that even after the balanced comparator (in our first example) or the cat 
(in the second example) has reached its final macroscopic state, a human observer’s realization that in this 
particular experiment the bistable pointer is in a certain state instantly decreases the probability (for the 
same observer!) of its being in the opposite state to zero. However, as was already discussed in Sec. 2.5, 
this is a very classical problem of the statistical ensemble redefinition that may be (or may not be) 
performed at the observer’s will. Such redefinition, if performed, is the only possible role of a human (or 
otherwise intelligent :-) observer in the measurement process; if we are only interested in an objective 
recording of results of a pre-fixed series of similar experiments, there is no need to include such an 
observer into any discussion. 
The ensemble redefinition at measurement leads to several other paradoxes, of which the so-called 
quantum Zeno paradox is perhaps the most spectacular. 86 Let us return to a two-level system with the 
unperturbed Hamiltonian given by Eq. (4.166), with $2\pi/\Omega$ much larger than the single-shot measurement 
time, and with the system initially (at $t = 0$) in a certain quantum well. Then, as we know from Secs. 2.6 
and 4.6, before the first measurement, the probability to find the system in the initial state at time $t$ is 

$$W(t) = \cos^2\frac{\Omega t}{2}. \qquad (7.221)$$

If the time is small enough ($t = dt \ll 1/\Omega$), we may use the Taylor expansion to write 

$$W(dt) \approx 1 - \frac{\Omega^2 dt^2}{4}. \qquad (7.222)$$


Now, let us return the two-level system, after its measurement, into the same quantum well, and 
let it evolve with the same Hamiltonian. Since the occupation of the opposite state is very small, the 
evolution of $W$ will closely follow the same law as in Eq. (221), but with the initial value given by Eq. 
(222). Thus, when the system is measured again at time $2dt$, 

$$W(2dt) \approx W(dt)\left(1 - \frac{\Omega^2 dt^2}{4}\right) = \left(1 - \frac{\Omega^2 dt^2}{4}\right)^2. \qquad (7.223)$$


After repeating this cycle $N$ times (with the total time $t = Ndt$ still much less than $N^{1/2}/\Omega$), the probability 
that the system is still in the initial state is 

$$W(Ndt) \equiv W(t) \approx \left(1 - \frac{\Omega^2 dt^2}{4}\right)^N = \left(1 - \frac{\Omega^2 t^2}{4N^2}\right)^N \approx 1 - \frac{\Omega^2 t^2}{4N}. \qquad (7.224)$$


Comparing this result with Eq. (222), we see that the process of system transfer to the opposite quantum 
well has been slowed down rather dramatically, and in the limit $N \to \infty$ (at fixed $t$), its evolution is 
completely stopped by the measurement process. There is of course nothing mysterious here; the 
evolution slowdown is due to the statistical ensemble’s redefinition. 
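
The slowdown described by Eq. (7.224) is easy to see numerically. The following minimal Python sketch assumes, as in the derivation above, that each measurement finds the system in the initial well and restarts the evolution law (7.221); the function names and the value $\Omega t = 1$ are illustrative choices, not taken from the text.

```python
import math


def survival_exact(omega: float, t: float, n: int) -> float:
    """Probability of still finding the system in the initial well after n
    equally spaced measurements within the total time t, assuming that every
    interval dt = t/n restarts the law W = cos^2(omega*dt/2) of Eq. (7.221)."""
    dt = t / n
    return math.cos(omega * dt / 2.0) ** (2 * n)


def survival_approx(omega: float, t: float, n: int) -> float:
    """Small-dt estimate 1 - omega^2 t^2 / (4 n), cf. Eq. (7.224)."""
    return 1.0 - (omega * t) ** 2 / (4.0 * n)


if __name__ == "__main__":
    omega, t = 1.0, 1.0  # arbitrary units, chosen so that omega * t = 1
    for n in (1, 10, 100, 1000):
        print(n, survival_exact(omega, t, n), survival_approx(omega, t, n))
```

At fixed $t$, the printed survival probability should approach 1 as $N$ grows, with the two columns converging to each other.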

Now let me proceed to question group (ii), in particular to the general issue of the back action of 
the instrument upon the system under measurement (symbolized with the back arrow in Fig. 7). In 
instruments like the Geiger counter or the balanced comparator, such back action is very large, because 
the instrument essentially destroys (“demolishes”) the initial state of the system under measurement. 
However, in the 1970s it was realized that this is not really necessary. In Sec. 3, we have already 


86 This name, coined by E. Sudarshan and B. Misra in 1977 (though the paradox had been discussed in detail by 
A. Turing in 1954), is due to the apparent similarity of this paradox to the classical paradoxes by the ancient Greek 
philosopher Zeno of Elea. By the way, just to have a minute of fun, let us have a look at what happens when Mother 
Nature is discussed by people who do not understand math and physics. The most famous of the classical Zeno 
paradoxes is the Achilles and Tortoise case: a fast runner Achilles can apparently never overtake a slower 
Tortoise, because (in the words of Aristotle) “the pursuer must first reach the point whence the pursued started, so 
that the slower must always hold a lead”. For a physicist, the paradox has a trivial resolution, but let us listen to what 
a philosopher (D. Burton) writes about it - not in some year BC, but in 2010 AD: “Given the history of 'final 
resolutions', from Aristotle onwards, it's probably foolhardy to think we've reached the end.” For me, this is a sad 
symbol of modern philosophy. 
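discussed an example of a two-level system coupled with its environment (in our current context, with 
the measurement instrument) and described by the Hamiltonian 

$$\hat{H} = \hat{H}_s + \hat{H}_{int} + \hat{H}_e\{\lambda\}, \quad \text{with } \hat{H}_s = a\hat{\sigma}_z, \quad \hat{H}_{int} = -\hat{f}\{\lambda\}\hat{\sigma}_z, \qquad (7.225)$$

so that 

$$\left[\hat{H}_s, \hat{H}_{int}\right] = 0. \qquad (7.226)$$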


Comparing this equality with Eq. (67) we see that in the Heisenberg picture, the Hamiltonian operator 
(and hence the energy) of the system of our interest does not change with time. On the other hand, the 
interaction can change the state of the instrument, so it may be used to measure the system's energy - or another 
observable whose operator commutes with the interaction Hamiltonian. Such a trick is called either a 
quantum non-demolition (QND) or a back-action-evading (BAE) measurement. 87 Let me present a fine 
example of a real measurement of this kind - see Fig. 9. 88 
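
The QND condition (7.226) can be checked on a toy model. The sketch below is only an illustration under stated assumptions: the “instrument” is replaced by a single two-level system, its operator $\hat{f}$ is taken to be a Pauli matrix, and the value of `a` is arbitrary. It compares a coupling through $\hat{\sigma}_z$ (which commutes with $\hat{H}_s$) with a coupling through $\hat{\sigma}_x$ (which does not).

```python
def kron(a, b):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def max_abs(m):
    """Largest absolute value among the matrix elements."""
    return max(abs(x) for row in m for x in row)


IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

a = 0.7  # hypothetical energy scale of the system of interest
# H_s = a * sigma_z acts on the system only (first tensor factor).
H_s = [[a * x for x in row] for row in kron(SIGMA_Z, IDENTITY)]
# QND-type coupling: system enters via sigma_z; instrument operator f = sigma_x.
H_int_qnd = [[-x for x in row] for row in kron(SIGMA_Z, SIGMA_X)]
# Non-QND coupling for comparison: system enters via sigma_x.
H_int_bad = [[-x for x in row] for row in kron(SIGMA_X, SIGMA_X)]

print("QND coupling:    ", max_abs(commutator(H_s, H_int_qnd)))
print("non-QND coupling:", max_abs(commutator(H_s, H_int_bad)))
```

Only the first commutator vanishes, so only the first coupling leaves the system's energy untouched in the Heisenberg picture.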
Fig. 7.9. QND measurement of a single electron’s energy by Peil and Gabrielse: (a) the core of the experimental 
setup, and (b) a record of the thermal excitation and spontaneous relaxation of Fock states. © 1999 APS. 


In this experiment, a single electron is captured in a Penning trap - a combination of a (virtually) 
uniform magnetic field B and a quadrupole electric field. 89 Such electric field stabilizes cyclotron orbits 
but does not have any noticeable effect on electron motion in the plane perpendicular to the magnetic 
field, and hence on its Landau level energies (see Sec. 3.2): 


87 For a detailed survey of this field see, e.g., either V. Braginsky and F. Khalili, Quantum Measurements, 
Cambridge U. Press, 1992, or H. Wiseman and G. Milburn, Quantum Measurement and Control, Cambridge U. 
Press, 2010. 

88 S. Peil and G. Gabrielse, Phys. Rev. Lett. 83, 1287 (1999). 

89 Similar to the one discussed in EM Sec. 2.4 (see in particular Eq. (2.77) and Fig. 2.7), but with additional 
rotation about one of the axes - either x or y. 
$$E_n = \hbar\omega_c\left(n + \frac{1}{2}\right), \quad \text{with } \omega_c = \frac{eB}{m_e}. \qquad (7.227)$$


(In the cited work, at $B \approx 5.3$ T, the cyclic frequency $\omega_c/2\pi$ was about 147 GHz, so that the level 
splitting $\hbar\omega_c$ was close to $10^{-22}$ J, i.e. corresponded to temperature ~ 10 K, while the physical 
temperature of the system might be reduced well below that, down to ~80 mK.) Now note that the 
analogy between a particle on a Landau level and a harmonic oscillator goes beyond the energy 
spectrum. Indeed, since the Hamiltonian of a 2D particle in a perpendicular magnetic field may be 
reduced to that of a 1D oscillator, we may repeat all procedures of Sec. 5.4 and rewrite it in terms of 
creation-annihilation operators: 


$$\hat{H}_s = \hbar\omega_c\left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right). \qquad (7.228)$$
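
As a quick sanity check of the numbers quoted above for the Peil and Gabrielse experiment, the following Python sketch evaluates Eq. (7.227) for $B = 5.3$ T. The physical constants are standard SI values; the function name is an illustrative choice.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837015e-31  # electron mass, kg
HBAR = 1.054571817e-34         # reduced Planck constant, J*s
K_B = 1.380649e-23             # Boltzmann constant, J/K


def landau_scales(b_tesla: float):
    """Return (cyclic frequency in Hz, level splitting in J, equivalent
    temperature in K) for an electron in the magnetic field b_tesla,
    using omega_c = e*B/m_e from Eq. (7.227)."""
    omega_c = E_CHARGE * b_tesla / M_ELECTRON
    splitting = HBAR * omega_c
    return omega_c / (2.0 * math.pi), splitting, splitting / K_B


if __name__ == "__main__":
    f_c, splitting, temperature = landau_scales(5.3)
    print(f"f_c = {f_c / 1e9:.1f} GHz, splitting = {splitting:.2e} J, "
          f"T = {temperature:.1f} K")
```

The result should land close to the quoted ~147 GHz and ~$10^{-22}$ J, with an equivalent temperature of several kelvin, i.e. of the order of the “~10 K” estimate.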


In the Peil and Gabrielse experiment, the electron had one more degree of freedom - along the 
magnetic field. The electric field of the Penning trap creates a soft confining potential along this 
direction (vertical in Fig. 9a; let us take it for axis $z$), so that small electron oscillations along that axis 
could be well described as a 1D harmonic oscillator of much lower eigenfrequency, in that particular 
experiment with $\omega_z/2\pi \approx 64$ MHz. This frequency could be measured very accurately (with error ~1 Hz) 
by sensitive electronics whose electric field affects the z-motion of the electron, but not its motion in the 
perpendicular plane. In an exactly uniform magnetic field, the two modes of electron motion would be 
completely uncoupled. However, the experimental setup included two special superconducting rings 
made of niobium (Fig. 9a), which slightly distorted the magnetic field and created an interaction 
between the modes, which might be well approximated by the Hamiltonian 90 


$$\hat{H}_{int} = \text{const} \times \left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right)\hat{z}^2, \qquad (7.228)$$


so that the main condition (226) of a QND measurement was well satisfied. At the same time, coupling 
(228) ensured that a change of the Fock state number $n$ by 1 changed the z-oscillation eigenfrequency by 
~12.4 Hz. Since this shift was substantially larger than the electronics noise, spontaneous changes of $n$ (due 
to an uncontrolled coupling of the electron to the environment) could be readily observed - moreover, 
continuously monitored - see Fig. 9b. (These data imply that there is virtually no effect of the measuring 
instrument on the statistics of $n$ - at least on the scale of minutes, i.e. as many as ~$10^{13}$ cyclotron orbit 
periods.) Of course, any measurement - QND or not - cannot avoid the Heisenberg uncertainty 
relations; in this particular case, permanent monitoring of the Fock state number $n$ keeps its quantum 
phase fully uncertain. 

It is natural to wonder whether the QND measurement concept may be extended from quadratic 
forms like the energy to “usual” observables such as coordinates and momenta whose uncertainties are 
bound by the fundamental Heisenberg’s relation. The answer is yes, but the required methods are a bit 
more tricky. For example, let us place an electrically charged particle into a uniform electric field 
$\boldsymbol{\mathcal{E}} = \mathbf{n}_x\mathcal{E}(t)$ of the instrument, so that their interaction Hamiltonian is 


90 I am simplifying the real situation a bit. Actually, in the experiment there was an electron spin’s contribution to 
the interaction Hamiltonian as well, but since the large magnetic field polarized the spins quite reliably, their only 
role was a constant shift of frequency $\omega_z$. 
$$\hat{H}_{int} = -q\mathcal{E}(t)\hat{x}. \qquad (7.229)$$


Such interaction certainly passes the information on the time evolution of coordinate $x$ to the instrument. 
However, Eq. (226) is not satisfied - at least for the kinetic-energy part of the system’s Hamiltonian; 
as a result, the interaction simultaneously distorts the time evolution of the particle’s momentum. Indeed, 
writing the Heisenberg equation of motion (4.199) for the x-component of momentum, we get 

$$\dot{\hat{p}} - \dot{\hat{p}}\big|_{\mathcal{E}=0} = q\mathcal{E}(t). \qquad (7.230)$$


Integrating Eq. (5.169) for the coordinate operator evolution, 91 we get the expression 

$$\hat{x}(t) = \hat{x}(0) + \frac{1}{m}\int_0^t \hat{p}(t')\,dt', \qquad (7.231)$$

which shows that the perturbations (230) of the momentum would eventually find their way to the 
coordinate evolution. 

However, for such an important particular system as a harmonic oscillator, the following trick is 
possible. For this system, Eqs. (5.170) and (230) may be readily combined to give a second-order 
differential equation for the coordinate operator, which is absolutely similar to the classical equation of 
motion, and has a similar solution: 92 

$$\hat{x}(t) = \hat{x}(t)\big|_{\mathcal{E}=0} + \frac{q}{m\omega_0}\int_0^t \mathcal{E}(t')\sin\omega_0(t - t')\,dt'. \qquad (7.232)$$


This formula confirms that generally the external field $\mathcal{E}(t)$ (in our case, the sensing field of the 
measurement instrument) affects the time evolution law. Note, however, that if the field is applied only 
at moments $t'_n$ separated by intervals $T/2$, where $T = 2\pi/\omega_0$ is the oscillation period, its effect on 
the coordinate vanishes at similarly spaced observation instants $t_n = t'_n + (m + 1/2)T$, with integer $m$. This is the idea of 
stroboscopic QND measurements. Of course, according to Eq. (230), even such a measurement strongly 
perturbs the oscillator momentum, so that even if values $x_n$ are measured with high accuracy, 
Heisenberg’s uncertainty relation is not violated. 
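
The stroboscopic cancellation follows directly from the $\sin\omega_0(t - t')$ kernel of Eq. (7.232). The minimal Python sketch below models the sensing field as a train of delta-like pulses (an assumption made here for simplicity; the pulse areas, the units with $T = 1$, and the function name are all illustrative) and evaluates the field-induced coordinate shift at stroboscopic and at generic observation instants.

```python
import math


def field_induced_shift(t, kick_times, kick_strengths, omega0, q_over_m=1.0):
    """Second term of Eq. (7.232) for a field made of delta-like pulses,
    E(t') = sum_n A_n * delta(t' - t_n'); pulses later than t do not count."""
    total = 0.0
    for t_kick, strength in zip(kick_times, kick_strengths):
        if t_kick <= t:
            total += strength * math.sin(omega0 * (t - t_kick))
    return q_over_m * total / omega0


omega0 = 2.0 * math.pi            # units in which the period T equals 1
period = 2.0 * math.pi / omega0
kick_times = [0.0, 0.5 * period, 1.0 * period, 1.5 * period]  # spaced by T/2
kick_strengths = [1.0, 0.3, -0.7, 2.0]                        # arbitrary pulse areas

if __name__ == "__main__":
    for t_obs in (2.0 * period, 2.5 * period, 2.25 * period):
        shift = field_induced_shift(t_obs, kick_times, kick_strengths, omega0)
        print(f"t = {t_obs:.2f} T: shift = {shift:+.3e}")
```

At instants separated from every pulse by a multiple of $T/2$ the shift is zero up to rounding errors, while at a generic instant (here $t = 2.25T$) it is not.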

Experimental implementation of such measurements is not simple (and to the best of my 
knowledge they have never been successfully demonstrated), but this initial idea has opened a way to 
more practicable solutions. For example, it is straightforward to use the Heisenberg equations of motion to 
show that if the coupling of two harmonic oscillators, with coordinates $x$ and $X$, and unperturbed 
eigenfrequencies $\omega$ and $\Omega$, is modulated in time as 

$$\hat{H}_{int} \propto \hat{x}\hat{X}\cos\omega t\,\cos\Omega t, \qquad (7.233)$$


91 This simple equation is limited to 1D systems with Hamiltonians of the type (2.50), but the reader should agree 
that this is a pretty general form. 

92 See, e.g., CM Sec. 4.1. Note in particular that the function $\sin\omega_0\tau$ (with $\tau \equiv t - t'$) under the integral, divided by $\omega_0$, 
is nothing more than the temporal Green’s function $G(\tau)$ of a loss-free harmonic oscillator. 
then the process in one of the oscillators (say, that with frequency $\Omega$) does not affect the dynamics of one of the 
quadrature components of the other oscillator, defined by relations 93 

$$\hat{x}_1 = \hat{x}\cos\omega t - \frac{\hat{p}}{m\omega}\sin\omega t, \qquad \hat{x}_2 = \hat{x}\sin\omega t + \frac{\hat{p}}{m\omega}\cos\omega t, \qquad (7.234)$$

while this component’s motion does affect the dynamics of one of quadrature components of the 
counterpart oscillator. (For the counterpart couple of quadrature components, the information transfer 
goes in the opposite direction.) This scheme has been successfully used for QND measurements in the 
optical range, with coupling (233) provided by the optical Kerr effect. 94 
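
The meaning of the quadrature components (7.234) may be illustrated classically: for free oscillations with frequency $\omega$, both $x_1$ and $x_2$ stay constant in time. The following Python sketch checks this for a classical trajectory; the amplitude, phase, mass, and function names are arbitrary illustrative assumptions.

```python
import math


def free_oscillation(t, amplitude, phase, omega, m=1.0):
    """Classical coordinate and momentum of an unperturbed harmonic oscillator."""
    x = amplitude * math.cos(omega * t + phase)
    p = -m * omega * amplitude * math.sin(omega * t + phase)
    return x, p


def quadratures(x, p, t, omega, m=1.0):
    """Quadrature components x1, x2 defined by Eq. (7.234)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    x1 = x * c - p / (m * omega) * s
    x2 = x * s + p / (m * omega) * c
    return x1, x2


if __name__ == "__main__":
    amplitude, phase, omega, m = 1.3, 0.4, 2.0, 0.5
    for t in (0.0, 0.7, 3.1):
        x, p = free_oscillation(t, amplitude, phase, omega, m)
        print(t, quadratures(x, p, t, omega, m))
```

For every $t$ the pair should equal $(A\cos\varphi, -A\sin\varphi)$, i.e. the representing point is at rest in the rotating frame.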

Please note that the last two QND measurement examples are based on the idea of modulation of 
a certain parameter in time - either in a short-pulse or sinusoidal form. So, the reader should not be 
surprised that if the only role of a QND measurement is a sensitive measurement of a weak classical 
force acting on a quantum probe system, 95 i.e. a 1D oscillator of eigenfrequency $\omega_0$, it may be 
implemented much more simply - just by modulating the oscillator parameter with frequency $\omega \approx 2\omega_0$. From 
classical dynamics, we know that if the depth of such modulation exceeds a certain threshold value, it 
results in excitation of the so-called parametric oscillations with frequency $\omega/2$, and one of two opposite 
phases. 96 In the language of Eq. (234), parametric excitation means an exponential growth of one of the 
quadrature components, with the sign depending on initial conditions, while the counterpart component 
is suppressed. Close to, but below the excitation threshold, the parameter modulation boosts all 
perturbations of the almost-excited component (including its quantum-mechanical uncertainty), and 
suppresses (squeezes) those of the counterpart component. The result is a squeezed state, already 
discussed in Sec. 5.5 above (see in particular Fig. 5.6), that allows one to notice the effect of an external 
force on the oscillator on the backdrop of a quantum uncertainty smaller than the standard quantum limit 
- see the first of Eqs. (5.174). 

In electrical engineering, this fact may be conveniently formulated in terms of the noise parameter 
$\Theta_N$ of a linear amplifier - the instrument for continuous monitoring of an input “signal” - e.g., a 
microwave or optical waveform. 97 Namely, $\Theta_N$ of “usual” (say, transistor or maser) amplifiers, which are 
equally sensitive to both quadrature components of the signal, has a minimum value $\hbar\omega/2$, due to the 
quantum uncertainty pertinent to the quantum state of the amplifier itself (which therefore plays the role 


93 The physical sense of these relations should be clear from Fig. 5.6: they define a system of coordinates rotating 
clockwise with angular velocity $\omega$, so that the point representing unperturbed classical oscillations with that 
frequency is at rest in that rotating frame. (The “probability cloud” presenting a Glauber state is also stationary in 
coordinates [$x_1$, $x_2$].) The reader familiar with the classical theory of oscillations may notice that $x_1$ and $x_2$ are 
essentially the RWA variables $u$ and $v$, i.e. the Poincaré plane coordinates - see, e.g., CM Secs. 4.3-4.6, and 
especially Fig. 4.9. 

94 See, e.g., P. Grangier et al., Nature 396 , 537 (1998), and references therein. This was, however, not the first 
QND implementation in optics - for a review see J. Roch et al., Appl. Phys. B 55 , 291 (1992). 

95 As it is, for example, for gravitational wave detectors - see the discussion and references in Sec. 2.10. 

96 See, e.g., CM Sec. 4.5. 

97 For the exact definition of the latter parameter, suitable for the quantum sensitivity range ($\Theta_N \sim \hbar\omega$) as well, 
see, e.g., I. Devyatov et al., J. Appl. Phys. 60, 1808 (1986). In the classical noise limit ($\Theta_N \gg \hbar\omega$), it coincides 
with $k_BT_N$, where $T_N$ is a more popular measure of electronics noise, called the noise temperature. 
of its “quantum noise”). 98 On the other hand, a degenerate parametric amplifier, sensitive to just one 
quadrature component, may have $\Theta_N$ well below $\hbar\omega/2$, due to the squeezing of its ground state. 99 

Finally, let me note that the parameter-modulation schemes of the QND measurements are not 
limited to harmonic oscillators, and may be applied to other important quantum systems, notably 
including two-level (i.e. spin-1/2-like) systems. 100 


7.8. Exercise problems 

7.1. Calculate the density matrix of a two-level system described by the Hamiltonian with matrix 

$$\mathrm{H} = \mathbf{a}\cdot\boldsymbol{\sigma} \equiv a_x\sigma_x + a_y\sigma_y + a_z\sigma_z,$$

where $\sigma_j$ are the Pauli matrices and $a_j$ are c-numbers, in thermodynamic equilibrium. 

7.2. Find the Wigner function of a harmonic oscillator: 

(i) at thermodynamic equilibrium at temperature $T$, 

(ii) in the ground state, and 

(iii) in the Glauber state with dimensionless complex amplitude $\alpha$. 

Discuss the relation between the first of the results and the Gibbs distribution. 

7.3. Calculate the Wigner function of a harmonic oscillator, with mass $m$ and frequency $\omega_0$, in its 
first excited stationary state ($n = 1$). 

7.4. Show that the quantum-mechanical Golden Rule (6.111) and the master equation (196) give 
the same results for the rate of spontaneous quantum transitions $n' \to n$ in a system with a discrete energy 
spectrum, weakly coupled to a low-temperature heat bath ($k_BT \ll \hbar\omega_{nn'}$). 

Hint: Try to establish a relation between the function $\mathrm{Im}\,\chi(\omega_{nn'})$ that participates in Eq. (196), and 
the density of states $\rho_n$ that participates in the Golden Rule formulas, by considering a particular case of 
sinusoidal oscillations in the system of interest. 

7.5.* A harmonic oscillator is weakly coupled to an Ohmic environment. 

(i) Use the rotating-wave approximation to write equations of motion for the Heisenberg 
operators of the complex amplitude of oscillations. 

(ii) Calculate the expectation values of the correlators of the fluctuation force operators, 
participating in these equations, and express them via the average number $\langle n\rangle$ of thermally-induced 
excitations in equilibrium, given by the second of Eqs. (26b). 

7.6. For a harmonic oscillator with weak Ohmic dissipation: 

(i) Spell out the system of differential equations for the energy level occupancies $W_n$. 


98 This fact was recognized very early - see, e.g., H. Haus and J. Mullen, Phys. Rev. 128 , 2407 (1962). 

99 See, e.g., the spectacular experiments by B. Yurke et al., Phys. Rev. Lett. 60, 764 (1988). 

100 See, e.g., D. Averin, Phys. Rev. Lett. 88, 207901 (2002). 
(ii) Use this system to find the time evolution of the expectation value $\langle E\rangle$ of the oscillator’s energy. 

(iii) Compare the last result with that following from the Heisenberg-Langevin approach. 

7.7. Derive Eq. (209) in an alternative way, using an expression dual to Eq. (5.28b). 

7.8.* A particle in a system of two coupled quantum wells (see, e.g., Fig. 4) is weakly coupled to 
an Ohmic environment. 

(i) Derive the equations of time evolution of the density matrix elements. 

(ii) Solve these equations in the low-temperature limit, when the energy level splitting is much 
larger than $k_BT$, to calculate the time evolution of the probability $W_1(t)$ of finding the particle in one of the 
wells, after it had been placed there at $t = 0$. 

7.9.* A spin-1/2 particle is placed into magnetic field $\boldsymbol{\mathcal{B}}(t) = \boldsymbol{\mathcal{B}}_0 + \tilde{\boldsymbol{\mathcal{B}}}(t)$ with an arbitrary but small 
time-dependent component ($|\tilde{\boldsymbol{\mathcal{B}}}| \ll |\boldsymbol{\mathcal{B}}_0|$), and is also weakly coupled to a dissipative environment. Derive 
the differential equations describing the time evolution of the expectation values ($\langle S_x\rangle$, etc.) of the spin’s 
Cartesian components. 



Chapter 8. Multiparticle Systems 

This chapter is a brief introduction to quantum mechanics of systems of similar particles, with a special 
attention to the case when they are indistinguishable. For such systems, the theory predicts (and 
experiment confirms) very specific effects even in the case of negligible explicit (“direct”) interaction 
between the particles. The effects notably include the Bose-Einstein condensation of bosons, and the 
Pauli exclusion principle and exchange interaction for fermions. 


8.1. Distinguishable and indistinguishable particles 

The importance of quantum systems of many similar particles is probably self-evident; the 
very fact that most atoms include several/many electrons is sufficient to attract our attention. There 
are also important systems where the number of electrons is much higher than in one atom; for example, 
a cubic centimeter of a typical metal features ~$10^{23}$ conduction electrons that cannot be attributed to 
particular atoms, and have to be considered as common (and interacting!) parts of the system as a whole. 
Though quantum mechanics offers virtually no exact analytical solutions for systems of strongly 
interacting particles, 1 it reveals very important new effects even in the simplest case when particles do 
not interact, at least explicitly (directly). 



If non-interacting particles are either different from each other by their nature (say, an electron 
and a proton), or physically similar but still distinguishable because of other reasons (say, because of 
their reliable spatial separation), everything is simple - at least, conceptually. Then, as was already 
discussed in Sec. 6.7, a system of two particles, 1 and 2, each in a pure quantum state, may be described 
by a ket vector 

$$|\alpha\rangle = |\beta\rangle_1 \otimes |\beta'\rangle_2, \qquad (8.1a)$$

where the single-particle states $\beta$ and $\beta'$ are defined in different Hilbert spaces. (Below, I will frequently 
use the following convenient shorthand, 

$$|\alpha\rangle = |\beta\beta'\rangle, \qquad (8.1b)$$

in which the state position within a vector codes the particle number.) Hence the permuted state 

$$\hat{P}|\beta\beta'\rangle \equiv |\beta'\beta\rangle = |\beta'\rangle_1 \otimes |\beta\rangle_2, \qquad (8.2)$$

where $\hat{P}$ is the permutation operator, is clearly different from the initial one. 


1 An important conceptual question is why not treat one particle of such a collection as an open quantum system, 
and apply to it the powerful methods discussed in the last chapter, based on the separation of the whole Nature 
into the “system of our interest” and the “environment” - see Fig. 7.1. Such separation is very natural and works 
very well in cases when one relatively “massive” (inertial) particle, or a specific collective degree of freedom (also 
relatively inertial), is surrounded by a sea of “lighter particles”, which serve the role of an environment - 
frequently in or close to thermal equilibrium. On the other hand, in most systems of identical particles, such 
separation is more artificial and may lead to errors, because the quantum state of the “particle of interest” may be 
substantially correlated (in particular, entangled) with that of similar particles of its “environment” - see the 
discussion later in this section. 
Again, such a description is valid even for identical particles if they are still distinguishable by their 
spatial separation. (The separation does not preclude particles from interacting with each other, e.g., 
electrostatically.) Such systems of similar but clearly distinguishable particles (or subsystems) are 
broadly discussed nowadays, for example in the context of quantum computing and encryption - see 
Sec. 8.5 below. This is why it is unfortunate that the term “identical particles” is frequently used in the 
sense of indistinguishable particles. I will try to avoid this confusion by using the latter term, despite it 
being rather unpleasant grammatically. 

Now comes the most important experimental fact: identical elementary particles, 2 if they are not 
reliably separated, are genuinely indistinguishable, i.e. their Hilbert spaces are not separable. Hence, 
instead of Eq. (1), for a set of two particles, we need to use a linear combination of products like $|\beta\beta'\rangle$ 
and $|\beta'\beta\rangle$ for the construction of genuine quantum states. 3 In order to comprehend exactly which linear 
combinations should be used, it is convenient to discuss properties of the permutation operator defined 
by the first of Eqs. (2). 

Let us consider an observable $A$, and a system of eigenstates of its operator: 

$$\hat{A}|\alpha_j\rangle = A_j|\alpha_j\rangle. \qquad (8.3)$$

If the particles are indistinguishable indeed, the observable expectation value should not be affected by 
their permutation. Hence operators $\hat{A}$ and $\hat{P}$ have to commute, and share their eigenstates. This is why 
eigenstates of operator $\hat{P}$ are so important: in particular, they are also eigenstates of the Hamiltonian, 
i.e. the stationary states of the system of particles. 

Now let us have a look at the operation described by the square of the permutation operator, on 
an elementary ket-vector product: 

$$\hat{P}^2|\beta\beta'\rangle = \hat{P}\left(\hat{P}|\beta\beta'\rangle\right) = \hat{P}|\beta'\beta\rangle = |\beta\beta'\rangle, \qquad (8.4)$$


2 Here by “elementary particles” I mean any of the following two options: 

(i) particles like electrons, which (at least at this stage of development of physics) are considered as 
structure-less entities; 

(ii) any object (e.g., a hadron or meson) which may be considered as a system of “more elementary” 
particles (e.g., quarks), but still may be reliably placed in a definite (say, ground) quantum state. 

From that point of view, even complex atoms or molecules of a certain chemical element, each in its 
ground state, may be considered on the same footing as elementary particles. 

3 A very legitimate question is why, in this situation, we need to introduce the particles’ numbers to start with. A 
partial answer is that in this approach it is much simpler to derive (or guess) problem Hamiltonians from the 
correspondence principle. For example, for a system of two spinless particles, each in an external potential $U(\mathbf{r})$, 
and with the interaction energy $U_{int}(|\mathbf{r}_1 - \mathbf{r}_2|)$, the correct Hamiltonian is 

$$\hat{H} = \frac{\hat{p}_1^2}{2m} + \frac{\hat{p}_2^2}{2m} + U(\hat{\mathbf{r}}_1) + U(\hat{\mathbf{r}}_2) + U_{int}(|\hat{\mathbf{r}}_1 - \hat{\mathbf{r}}_2|).$$

Later in this chapter, we will discuss an alternative approach (the so-called “second quantization”) in which 
tracing a certain particle is avoided. While for indistinguishable particles this is more logical, in that approach 
writing adequate Hamiltonians (which, in particular, would avoid spurious self-interaction of the particles) is 
much more challenging - see Sec. 3 below. 
i.e. $\hat{P}^2$ brings the state back to its original form. Since any pure state of a two-particle system may be 
represented as a linear combination of such products, this result does not depend on the state, and may 
be represented as an operator relation: 

$$\hat{P}^2 = \hat{I}. \qquad (8.5)$$


Now let us find the possible eigenvalues $P_j$ of the permutation operator. Acting by both sides of 
Eq. (5) on any of the eigenstates $|\alpha_j\rangle$ of the permutation operator, we get a very simple equation for its 
eigenvalues: 

$$P_j^2 = 1, \qquad (8.6)$$

with two possible solutions: 

$$P_j = \pm 1. \qquad (8.7)$$

Let us find the eigenstates of the permutation operator in the simplest case when each of the 
component particles can be only in two single-particle states - say, $\beta$ and $\beta'$. Evidently, none of the 
simple products $|\beta\beta'\rangle$ and $|\beta'\beta\rangle$, taken alone, qualifies as an eigenstate - unless states $\beta$ and $\beta'$ are 
identical. Let us try their linear combination 

$$|\alpha_j\rangle = a|\beta\beta'\rangle + b|\beta'\beta\rangle, \qquad (8.8)$$

so that 

$$\hat{P}|\alpha_j\rangle = P_j|\alpha_j\rangle = a|\beta'\beta\rangle + b|\beta\beta'\rangle. \qquad (8.9)$$

For the case $P_j = +1$ we have to require states (8) and (9) to be the same, so that $a = b$. Assuming also 
that the single-particle states $\beta$ and $\beta'$ are normalized, and requiring the same for the composite state $\alpha$, 
we get the so-called symmetric eigenstate 4 

$$|\alpha_+\rangle = \frac{1}{\sqrt{2}}\left(|\beta\beta'\rangle + |\beta'\beta\rangle\right). \qquad (8.10)$$

Similarly, for $P_j = -1$ we get $a = -b$, and the antisymmetric eigenstate 

$$|\alpha_-\rangle = \frac{1}{\sqrt{2}}\left(|\beta\beta'\rangle - |\beta'\beta\rangle\right), \qquad (8.11)$$


where the front coefficients guarantee the orthonormality of the two-particle states, provided that the 
single-particle states are orthonormal. These are typical examples of entangled states, defined as multiparticle 
states whose state vectors cannot be factored into a product of single-particle vectors. 
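
Eqs. (8.5)-(8.11) can be verified on the smallest possible example. The Python sketch below assumes that each particle has just two orthonormal single-particle states (labeled 0 and 1, a labeling introduced here for illustration), builds the 4×4 matrix of the permutation operator in the product basis $|\beta\beta'\rangle$, and checks that $\hat{P}^2 = \hat{I}$ and that the states (8.10) and (8.11) are its eigenstates with eigenvalues +1 and -1.

```python
import math


def swap_matrix(d=2):
    """Matrix of the permutation operator in the product basis |i j>,
    with the basis index equal to i*d + j: P|i j> = |j i>."""
    n = d * d
    p = [[0.0] * n for _ in range(n)]
    for i in range(d):
        for j in range(d):
            p[j * d + i][i * d + j] = 1.0
    return p


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def product_ket(i, j, d=2):
    """Column of coefficients of the product state |i j>."""
    v = [0.0] * (d * d)
    v[i * d + j] = 1.0
    return v


P = swap_matrix()
ket_01, ket_10 = product_ket(0, 1), product_ket(1, 0)
symmetric = [(x + y) / math.sqrt(2) for x, y in zip(ket_01, ket_10)]      # Eq. (8.10)
antisymmetric = [(x - y) / math.sqrt(2) for x, y in zip(ket_01, ket_10)]  # Eq. (8.11)

if __name__ == "__main__":
    print("P^2 =", matmul(P, P))
    print("P|sym>  =", matvec(P, symmetric))
    print("P|anti> =", matvec(P, antisymmetric))
```

The product states $|01\rangle$ and $|10\rangle$ taken alone are merely exchanged by $\hat{P}$, so that neither of them is its eigenstate.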

So far, our math does not preclude either sign of $P_j$, in particular the possibility that the sign 
depends on the state (i.e. index $j$). Here, however, comes in another crucial experimental fact: all 
elementary particles fall into two groups: 5 


4 As in many situations we met before, kets (10) and (11) may be multiplied by $\exp\{i\varphi\}$ with an arbitrary real 
phase $\varphi$. However, until we discuss coherent superpositions of various states $\alpha$, there is no good motivation for 
taking the phase different from 0; that would only clutter the notation. 
(i) bosons, particles with integer spin $s$, for which $P_j = +1$ for any $j$, and 

(ii) fermions, particles with half-integer spin, with $P_j = -1$, also for any $j$. 

In the nonrelativistic theory we are discussing now, this key fact should be considered as 
an experimental one. (The relativistic quantum theory, to be discussed in Chapter 9, offers a proof that 
half-integer-spin particles cannot be bosons and integer-spin ones cannot be fermions, but not more than 
that.) However, our discussion of spin in Sec. 5.7 allows the following interpretation of the 
fermion-boson difference. In free space, the permutation of particles 1 and 2 may be viewed as a result of 
rotation of this pair by angle $\pm\pi$ about a certain axis. As we have seen in Sec. 5.7, at a rotation by such 
an angle, the state vector $|\beta\rangle$ of a particle with quantum number $m_s$ (that ranges from $-s$ to $+s$, and hence 
may take only integer values for integer $s$, and only half-integer values for half-integer $s$) changes by 
factor $\exp\{\pm i\pi m_s\}$, so that the state product $|\beta\beta'\rangle$ changes by $\exp\{\pm i2\pi m_s\}$, i.e. by factor +1 for integer 
$s$, and by factor (-1) for half-integer $s$. 
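
The last step of this argument is simple arithmetic, which may be spelled out as a short Python sketch (the function name is an illustrative choice): the factor $\exp\{i2\pi m_s\}$ equals +1 for integer $m_s$ and -1 for half-integer $m_s$.

```python
import cmath
from fractions import Fraction


def exchange_factor(m_s):
    """Factor exp(i*2*pi*m_s) acquired by the two-particle product state
    at the rotation that permutes the particles (upper sign taken)."""
    return cmath.exp(2j * cmath.pi * float(m_s))


if __name__ == "__main__":
    for m_s in (Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2)):
        print(f"m_s = {m_s}: factor = {round(exchange_factor(m_s).real):+d}")
```

Taking the lower sign in the exponent gives the same real factors, since they are equal to ±1.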

Since eigenvalues $P_j$ do not depend on the particular state of the system, we can write explicit 
expressions for the permutation operator: 

$$\hat{P} = \hat{I} \times \begin{cases} +1, & \text{for bosons}, \\ -1, & \text{for fermions}. \end{cases} \qquad (8.12)$$

The most impressive corollaries of Eqs. (10) and (11) are for the case when the partial states of 
the two particles are the same: $\beta = \beta'$. The corresponding Bose state $\alpha_+$ is possible; in particular, at 
sufficiently low temperatures, a set of non-interacting Bose particles condenses on the ground state of 
each of them - the so-called Bose-Einstein condensate (“BEC”). 6 Its examples include superfluids 
like helium, the Cooper-pair condensate in superconductors, and the BEC of weakly interacting atoms. 
Perhaps the most fascinating feature of a multiparticle Bose-Einstein condensate is that the dynamics of its 
observables is governed by the laws of quantum mechanics, while the observables themselves (for nearly all 
purposes) may be treated as c-numbers - see, e.g., Eqs. (2.54)-(2.55). 7 

On the other hand, if we take $\beta = \beta'$ in Eq. (11), we see that state $\alpha_-$ vanishes, i.e. cannot exist at 
all. This is the mathematical expression of the Pauli exclusion principle: two indistinguishable fermions 
cannot be in the same quantum state. 8 (As will be discussed below, this is true for systems with more 
than two fermions as well.) Probably, the key importance of this principle is self-evident: if it were not 
valid for electrons (that are fermions), all electrons of each atom would condense on its ground (1s) 
level, and all the usual chemistry (and biochemistry, and biology, including dear us!) would not exist. 
The Pauli principle effectively makes fermions interacting even if they do not interact directly, in the 
usual sense of this word. 
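
The vanishing of the antisymmetric combination (8.11) at $\beta = \beta'$ does not depend on the particular single-particle state. A minimal Python sketch, assuming single-particle states represented by short lists of complex amplitudes (the specific amplitudes and function names below are arbitrary illustrations):

```python
import math


def tensor(u, v):
    """Amplitudes of the product state |u v> in the product basis."""
    return [a * b for a in u for b in v]


def symmetrize(u, v):
    """Combination (|u v> + |v u>)/sqrt(2), cf. Eq. (8.10)."""
    return [(x + y) / math.sqrt(2) for x, y in zip(tensor(u, v), tensor(v, u))]


def antisymmetrize(u, v):
    """Combination (|u v> - |v u>)/sqrt(2), cf. Eq. (8.11)."""
    return [(x - y) / math.sqrt(2) for x, y in zip(tensor(u, v), tensor(v, u))]


def norm(w):
    """Euclidean norm of a list of complex amplitudes."""
    return math.sqrt(sum(abs(x) ** 2 for x in w))


beta = [0.6, 0.8j]                # a normalized single-particle state
up, down = [1.0, 0.0], [0.0, 1.0]  # two orthonormal single-particle states

if __name__ == "__main__":
    print("fermions, same state:      ", norm(antisymmetrize(beta, beta)))
    print("fermions, orthogonal states:", norm(antisymmetrize(up, down)))
    print("bosons, same state:        ", norm(symmetrize(beta, beta)))
```

Note that for identical single-particle states the symmetric combination, taken literally with the $1/\sqrt{2}$ coefficient of Eq. (8.10), is not normalized (its norm is $\sqrt{2}$), because that coefficient was derived for orthonormal $\beta$ and $\beta'$.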


5 Traditionally, people speak about two different “statistics”: the Bose-Einstein statistics of bosons, and Fermi- 
Dirac statistics of fermions, because their statistical distributions in thermal equilibrium are indeed different - see, 
e.g., SM Sec. 2.8. However, as evident from the above discussion, their difference is deeper, and actually we are 
dealing with two different quantum mechanics. 

6 For a quantitative discussion of the Bose-Einstein condensation see, e.g., SM Sec. 3.4. 

7 Such a possibility follows from the fact that for the Bose-Einstein condensate of $N \gg 1$ particles, the Heisenberg
uncertainty relation may be reduced to $\delta N\,\delta\varphi > 1$, where $\varphi$ is the condensate wavefunction’s phase, so that it may
have $\delta N/\langle N\rangle \ll 1$ and $\delta\varphi \ll 1$ simultaneously.

8 It was formulated by W. Pauli in 1925, on the basis of less general rules suggested by G. Lewis (1916), I. 
Langmuir (1919), N. Bohr (1922), and E. Stoner (1924) for the explanation of experimental spectroscopic data. 
8.2. Singlets, triplets, and the exchange interaction

Now let us discuss possible approaches to the analysis of identical particles using a simple but very
important example: a pair of spin-1/2 particles (say, electrons) whose interaction with either each other
or the external world does not involve spin. Then the ket-vector of the total state is factorable as

$$|\alpha_-\rangle = |o_{12}\rangle \otimes |s_{12}\rangle, \tag{8.13}$$

with the orbital function $|o_{12}\rangle$ and the spin function $|s_{12}\rangle$ (that depends on the state of both spins of the
pair) belonging to different Hilbert spaces. It is frequently convenient to use the coordinate
representation of such a state, sometimes called the (2-particle) spinor:

$$\langle \mathbf{r}_1, \mathbf{r}_2|\alpha_-\rangle = \langle \mathbf{r}_1, \mathbf{r}_2|o_{12}\rangle \otimes |s_{12}\rangle \equiv \psi(\mathbf{r}_1, \mathbf{r}_2)\,|s_{12}\rangle. \tag{8.14}$$

Since spin-1/2 particles are fermions, the particle permutation,

$$\hat{\mathcal{P}}\,\psi(\mathbf{r}_1, \mathbf{r}_2)\,|s_{12}\rangle = \psi(\mathbf{r}_2, \mathbf{r}_1)\,|s_{21}\rangle = -\psi(\mathbf{r}_1, \mathbf{r}_2)\,|s_{12}\rangle, \tag{8.15}$$

has to change the sign of either the spin factor or the orbital factor of the spinor. In the case of a
symmetric orbital factor,

$$\psi(\mathbf{r}_2, \mathbf{r}_1) = \psi(\mathbf{r}_1, \mathbf{r}_2), \tag{8.16}$$

the spin factor has to obey the relation

$$|s_{21}\rangle = -|s_{12}\rangle. \tag{8.17}$$

Let us use the ordinary z-basis (where z, in the absence of external magnetic field, is an arbitrary
spatial axis) for each of the spins. In this basis, any ket-vector $|s_{12}\rangle$ of spin orientation of the two particles
may be represented as a linear combination of four basis vectors:

$$|\uparrow\uparrow\rangle, \quad |\downarrow\downarrow\rangle, \quad |\uparrow\downarrow\rangle, \quad \text{and} \quad |\downarrow\uparrow\rangle. \tag{8.18}$$

The first two kets evidently do not satisfy Eq. (17), and cannot participate in the state. Applying to the
remaining kets the same argumentation as has resulted in Eq. (11), we get the singlet state

$$|s_{12}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right). \tag{8.19}$$


Such an orbital-symmetric and spin-antisymmetric state is called the singlet. The origin of this name
becomes clear from the analysis of the opposite (orbital-antisymmetric and spin-symmetric) case:

$$\psi(\mathbf{r}_2, \mathbf{r}_1) = -\psi(\mathbf{r}_1, \mathbf{r}_2), \qquad |s_{12}\rangle = |s_{21}\rangle. \tag{8.20}$$

For the composition of such a symmetric spin state, the first two kets of Eq. (18) are completely
acceptable (with arbitrary weights), and so is a specific symmetric combination of the two last kets, similar
to Eq. (10). The result is the set of triplet states:

$$|s_{12}\rangle = c_+|\uparrow\uparrow\rangle + c_-|\downarrow\downarrow\rangle + c_0\,\frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right). \tag{8.21}$$
We may use this composite state with any values of coefficients c (satisfying the normalization
condition), because they correspond to the same orbital wavefunction and hence the same energy.
However, each of these three states has a specific value of the z-component of the net spin (respectively,
$+\hbar$, $-\hbar$, and 0). 9 Because of this, even a small external magnetic field lifts their degeneracy, splitting the
energy level in three, and giving it the natural name of triplet.
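The spin states (19) and (21) can be checked numerically. The following is a minimal sketch, assuming units with ħ = 1 and the standard Pauli-matrix representation of each spin-1/2 in the z-basis; the variable names (`S2`, `Sz`, `states`, `results`) are illustrative and not part of the original notes. It evaluates the expectation values of the squared net spin and of its z-component for the singlet and for the three triplet components.

```python
import numpy as np

hbar = 1.0  # assumption: work in units of hbar

# Single spin-1/2 operators in the z-basis (up, down): s = (hbar/2) * Pauli matrices
sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * hbar * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Net spin components S = s1 + s2 acting in the 4-dimensional product space
S = [np.kron(s, I2) + np.kron(I2, s) for s in (sx, sy, sz)]
S2 = sum(Si @ Si for Si in S)   # operator of the squared net spin
Sz = S[2]

up = np.array([1, 0], dtype=complex)
dn = np.array([0, 1], dtype=complex)
uu, dd = np.kron(up, up), np.kron(dn, dn)
ud, du = np.kron(up, dn), np.kron(dn, up)

states = {
    "singlet":   (ud - du) / np.sqrt(2),   # Eq. (19)
    "triplet +": uu,                       # components of Eq. (21)
    "triplet -": dd,
    "triplet 0": (ud + du) / np.sqrt(2),
}

def expectation(op, ket):
    """Expectation value <ket|op|ket> of a Hermitian operator."""
    return float(np.real(np.vdot(ket, op @ ket)))

results = {name: (expectation(S2, ket), expectation(Sz, ket))
           for name, ket in states.items()}

for name, (s2_value, sz_value) in results.items():
    print(f"{name:10s}  <S^2> = {s2_value:.3f} hbar^2,  <S_z> = {sz_value:+.3f} hbar")
```

In this sketch the singlet gives zero for both quantities, while each triplet component gives the squared spin of 2ħ² expected for s = 1, including the component with zero z-projection.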

In the particular case when the particles do not interact at all, for example 10

$$\hat{H} = \hat{h}_1 + \hat{h}_2, \qquad \hat{h}_k = \frac{\hat{p}_k^2}{2m} + \hat{u}(\mathbf{r}_k), \qquad k = 1, 2, \tag{8.22}$$

the 2-particle Schrödinger equation for the symmetric orbital wavefunction (16) is obviously satisfied
by the simple product,

$$\psi(\mathbf{r}_1, \mathbf{r}_2) = \psi_n(\mathbf{r}_1)\,\psi_{n'}(\mathbf{r}_2), \tag{8.23}$$

of single-particle eigenfunctions, with arbitrary sets n, n' of quantum numbers. For the particular (but
very important!) case n = n', this means that the eigenenergy of the singlet state,

$$\psi_n(\mathbf{r}_1)\,\psi_n(\mathbf{r}_2)\,\frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right), \tag{8.24}$$

is just $2\varepsilon_n$, where $\varepsilon_n$ is the single-particle energy level. It may be proved that the lowest energy of the
triplet state is always higher than that. Hence, for the limited (but extremely important!) goal of finding
ground-state energies of multi-electron systems, we may ignore the actual singlet structure of spinor
(24), and reduce the Pauli exclusion principle to the semi-qualitative picture of single-particle levels,
each “occupied” with 2 independent particles.

As a very simple example, let us find the ground energy of a deep, cubic-shaped, 3D quantum
well with side a, filled with 5 fermions, ignoring their direct interaction. From the solution of the single-
particle Schrödinger equation in Sec. 1.5, we know the single-particle energy spectrum of the system:

$$\varepsilon_{n_x, n_y, n_z} = \varepsilon_0\left(n_x^2 + n_y^2 + n_z^2\right), \quad \text{with} \quad \varepsilon_0 = \frac{\pi^2\hbar^2}{2ma^2}, \quad \text{and} \quad n_x, n_y, n_z = 1, 2, \ldots \tag{8.25}$$

so that the lowest-energy orbital states are:

- one ground state with $\{n_x, n_y, n_z\} = \{1,1,1\}$, and energy $\varepsilon_{111} = (1^2 + 1^2 + 1^2)\varepsilon_0 = 3\varepsilon_0$, and

- three excited states, with $\{n_x, n_y, n_z\}$ equal to {2,1,1}, {1,2,1}, and {1,1,2}, with equal energies
$\varepsilon_{211} = \varepsilon_{121} = \varepsilon_{112} = (2^2 + 1^2 + 1^2)\varepsilon_0 = 6\varepsilon_0$.

According to the Pauli principle, each of these orbital states can accommodate up to 2
electrons. Hence the lowest-energy (ground) state of the 5-electron system is achieved by placing 2 of

9 Note that in the sense of Eq. (5.197), all three triplet states of a two-electron system behave as a single integer
spin with s = 1; for example, $S^2$ equals $2\hbar^2$, rather than 0 as one could expect for the last component of Eq. (21) -
see Problem 1.

10 In this chapter, I try to use lower-case letters for observables of single particles (in particular, $\varepsilon$ for their
energies), in order to distinguish them as clearly as possible from the system’s variables, including the total energy E
of the system, typeset in capital letters.
them on the ground level $\varepsilon_{111} = 3\varepsilon_0$, and the remaining 3 particles in any of the degenerate “excited”
states of energy $6\varepsilon_0$. Hence the ground energy of the system is

$$E_g = 2 \times 3\varepsilon_0 + 3 \times 6\varepsilon_0 = 24\varepsilon_0 = \frac{12\pi^2\hbar^2}{ma^2}. \tag{8.26}$$

In many cases, a relatively weak interaction between particles does not blow up such a simple
quantum state classification scheme, and the Pauli principle allows tracing the order of single-particle
state filling with Fermi particles. This is exactly the approach that has been used in our discussion of
atoms in Sec. 3.7.
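The level-filling arithmetic leading to Eq. (26) is easy to automate. Below is a minimal sketch, assuming non-interacting spin-1/2 fermions in the cubic well with the spectrum (25); the function name `ground_energy` and the cutoff parameter `n_max` are hypothetical helpers introduced only for this illustration. Energies are expressed in units of ε₀.

```python
from itertools import product

def ground_energy(n_fermions, n_max=6, spin_degeneracy=2):
    """Ground energy, in units of eps_0, of non-interacting fermions in a cubic well.

    Orbital energies are n_x^2 + n_y^2 + n_z^2 (Eq. 8.25); each orbital state
    accepts up to `spin_degeneracy` particles (Pauli principle).
    `n_max` is an illustrative cutoff on the quantum numbers, adequate for small N.
    """
    levels = sorted(nx * nx + ny * ny + nz * nz
                    for nx, ny, nz in product(range(1, n_max + 1), repeat=3))
    energy, remaining = 0, n_fermions
    for level in levels:
        if remaining == 0:
            break
        placed = min(spin_degeneracy, remaining)  # fill the lowest free orbital
        energy += placed * level
        remaining -= placed
    if remaining:
        raise ValueError("n_max is too small for this number of particles")
    return energy

print(ground_energy(5))  # ground energy of the 5-fermion system, in units of eps_0
```

For 5 particles the sketch places 2 of them on the level 3ε₀ and 3 on the degenerate level 6ε₀, reproducing the 24ε₀ of Eq. (26).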


Now let us describe the results of particle interaction more quantitatively, on the simplest
example 11 of the lowest energy states of a neutral atom 12 of helium - more exactly, helium-4, usually
denoted $^4$He, consisting of a nucleus with two protons and two neutrons, of the total electric charge q =
+2e, and two electrons “rotating” about it. Neglecting the small relativistic effects that were discussed in
Sec. 6.3, the Hamiltonian describing the electron motion may be represented as

$$\hat{H} = \hat{h}_1 + \hat{h}_2 + \hat{U}_{\rm int}, \qquad \hat{h}_k = \frac{\hat{p}_k^2}{2m} - \frac{2e^2}{4\pi\varepsilon_0 r_k}, \qquad \hat{U}_{\rm int} = \frac{e^2}{4\pi\varepsilon_0\,|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{8.27}$$


As with most problems of multiparticle quantum mechanics, the eigenvalue/eigenstate problem for
this Hamiltonian does not have an exact analytical solution, so let us start an approximate analysis
by considering the electron-electron interaction as a perturbation. As was discussed in Chapter 6, we have
to start with the “0th-order” approximation in which the perturbation is ignored, so that the Hamiltonian
is reduced to sum (22). In this approximation, the ground state g of the atom is the singlet (24), with the
orbital factor

$$\psi_g(\mathbf{r}_1, \mathbf{r}_2) = \psi_{100}(\mathbf{r}_1)\,\psi_{100}(\mathbf{r}_2), \tag{8.28}$$


and energy $2\varepsilon_g$. Here each factor $\psi_{100}(\mathbf{r})$ is the single-particle wavefunction of the ground (1s) state of
the hydrogen-like atom with Z = 2, with quantum numbers n = 1, l = 0, m = 0. According to Eqs. (3.174)
and (3.198),

$$\psi_{100}(\mathbf{r}) = \frac{1}{\sqrt{4\pi}}\,\frac{2}{r_0^{3/2}}\,e^{-r/r_0}, \quad \text{with} \quad r_0 = \frac{r_B}{Z} = \frac{r_B}{2}, \tag{8.29}$$

so that according to Eq. (3.191), in this approximation the total ground state energy is

$$E_g^{(0)} = 2\varepsilon_g = 2\left(-\frac{\varepsilon_0}{2n^2}\right)_{n=1,\,Z=2} = 2\left(-\frac{Z^2E_H}{2}\right)_{Z=2} = -4E_H \approx -109\ \text{eV}. \tag{8.30}$$


This is still somewhat far (though not terribly far!) from the experimental value $E_g \approx -78.8$ eV - see the
bottom level in Fig. 1a.


11 It is also very important, since helium makes up more than 20% of all “ordinary” matter of our Universe. 

12 Evidently, the positive ion He$^+$ of such an atom, with just one electron, is very well described by the hydrogen-
like atom theory with Z = 2, whose ground-state energy, according to Eq. (3.191), is $-Z^2E_H/2 = -2E_H \approx -54.4$ eV.
Making a small detour from our main topic, electron indistinguishability effects, let us note that
we can get a much better agreement with experiment by calculating the electron interaction energy in
the 1st order of the perturbation theory. Indeed, in application to our system, Eq. (6.13) reads

$$E_g^{(1)} = \langle g|\hat{U}_{\rm int}|g\rangle = \int d^3r_1 \int d^3r_2\; \psi_g^*(\mathbf{r}_1, \mathbf{r}_2)\,U_{\rm int}(\mathbf{r}_1, \mathbf{r}_2)\,\psi_g(\mathbf{r}_1, \mathbf{r}_2). \tag{8.31}$$

Plugging in Eqs. (27)-(29), we get

$$E_g^{(1)} = \left(\frac{1}{\pi r_0^3}\right)^2 \int d^3r_1 \int d^3r_2\; \frac{e^2}{4\pi\varepsilon_0\,|\mathbf{r}_1 - \mathbf{r}_2|}\,\exp\left\{-\frac{2(r_1 + r_2)}{r_0}\right\}. \tag{8.32}$$

As may be readily evaluated analytically (this exercise is left for the reader), this expression equals
$(5/4)E_H$, so that the corrected ground state energy,

$$E_g \approx E_g^{(0)} + E_g^{(1)} = \left(-4 + \frac{5}{4}\right)E_H \approx -74.8\ \text{eV}, \tag{8.33}$$

is much closer to experiment.
Fig. 8.1. The lowest energy levels of a helium atom: (a) experimental data and (b) a schematic structure
of an excited state with fixed n and l in the first order of the perturbation theory. On panel (a), all
energies are referred to that ($-2E_H \approx -54.4$ eV) of the ground state of ion He$^+$, so that their magnitudes
are the (readily measurable) energies of the atom’s ionization starting from the corresponding bound state.

There is still room for improvement, which may be made, for example, using the variational
method, 13 based on the following, very general observation. Let n be the exact, full and orthonormal set
of stationary states of a quantum system, and let us use it as the basis for expansion of a normalized but
otherwise arbitrary trial state $\alpha$ (defined in the same Hilbert space):

$$|\alpha\rangle = \sum_n \alpha_n |n\rangle, \tag{8.34}$$


13 See also Problems 2. 6-2. 8, 2.34, and 3.3. 
with the energy that may be calculated using the general Eq. (4.125):

$$E_\alpha = \langle\alpha|\hat{H}|\alpha\rangle = \sum_n W_n E_n, \quad \text{where} \quad W_n = |\alpha_n|^2 \geq 0. \tag{8.35}$$

Since, by definition, the exact ground state energy $E_g$ is the lowest one of the set $E_n$, we can use Eq. (35)
to compose the following inequality, which justifies the variational method:

$$E_\alpha \geq \sum_n W_n E_g = E_g \sum_n W_n = E_g. \tag{8.36}$$

Thus, the ground state energy is always lower than (or equal to) the energy of any trial state $\alpha$. Hence, if
we make several attempts with reasonably selected trial states, we may expect the lowest of the results
to approximate the genuine ground state energy reasonably well.

For our particular case of a $^4$He atom, we may try to use, as the trial state, the wavefunction
given by Eqs. (28)-(29), but with the atomic number Z considered as an adjustable parameter $Z_{ef} < Z = 2$
rather than a fixed number. The physics behind this idea is that the electric charge density $\rho(\mathbf{r}) = -e|\psi(\mathbf{r})|^2$
of each electron forms a negatively charged “cloud” that reduces the effective charge of the
nucleus, as seen by the other electron, to $Z_{ef}e$, with some $Z_{ef} < 2$. As a result, the single-particle
wavefunction spreads further in space ($r_0 = r_B/Z_{ef} > r_B/Z$), while keeping its functional form (29) nearly
intact. Since the kinetic energies T in the system’s Hamiltonian are proportional to $r_0^{-2}$, while the potential
energies scale as $r_0^{-1}$, we can write

$$E_g(Z_{ef}) = \left(\frac{Z_{ef}}{Z}\right)^2 \langle T\rangle_{Z_{ef}=Z} + \frac{Z_{ef}}{Z}\,\langle U\rangle_{Z_{ef}=Z}. \tag{8.37}$$


Now we can use the fact that according to Eq. (3.202), for any stationary state of a hydrogen-like atom
(just as for the classical circular motion in the Coulomb potential), $\langle U\rangle = 2E$, and hence $\langle T\rangle = E - \langle U\rangle = -E$.
Using Eq. (8.30), and adding the correction $E_g^{(1)} = (5/4)E_H$ calculated above to the potential energy,
we get

$$E_g(Z_{ef}) = \left[4\left(\frac{Z_{ef}}{2}\right)^2 + \left(-8 + \frac{5}{4}\right)\frac{Z_{ef}}{2}\right]E_H. \tag{8.38}$$

The minimum of function $E_g(Z_{ef})$ and the corresponding “optimal” value of $Z_{ef}$ are as follows:

$$(Z_{ef})_{\rm opt} = 2\left(1 - \frac{5}{32}\right) = 1.6875, \qquad (E_g)_{\rm min} \approx -2.85E_H \approx -77.5\ \text{eV}. \tag{8.39}$$


Given the crudeness of the trial function, this number is in surprisingly good agreement with the experimental
value cited above, with a difference of the order of 1%. 14
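The minimization leading to Eq. (39) may be reproduced numerically. The following is a minimal sketch, assuming the trial energy in the form (38), measured in units of the Hartree energy, and using a crude grid search instead of an analytical derivative; the names `trial_energy`, `z_opt` and `e_min` are illustrative, and the value 27.211 eV for the Hartree energy is taken as a known constant rather than from this chapter.

```python
E_H_EV = 27.211   # Hartree energy in eV (standard value)
Z = 2             # actual atomic number of helium

def trial_energy(z_ef):
    """Trial ground-state energy of He, in units of E_H, as in Eq. (8.38)."""
    kinetic = 4.0 * (z_ef / Z) ** 2                 # <T> scales as r_0^-2, i.e. as Z_ef^2
    potential = (-8.0 + 5.0 / 4.0) * (z_ef / Z)     # <U> plus the e-e correction, scales as Z_ef
    return kinetic + potential

# Crude grid search over 1.0 <= Z_ef <= 2.0 with step 1e-4
grid = [1.0 + 1e-4 * i for i in range(10001)]
z_opt = min(grid, key=trial_energy)
e_min = trial_energy(z_opt)

print(f"optimal Z_ef = {z_opt:.4f}")
print(f"minimum energy = {e_min:.4f} E_H = {e_min * E_H_EV:.1f} eV")
```

Within the grid resolution, the search returns the same optimal effective charge, 27/16 = 1.6875, and the same minimum energy as quoted in Eq. (39).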

Now let us return to our basic topic - the effects of electron indistinguishability. As we have just
seen, the ground level energy of the helium atom is not affected directly by this fact, but the situation is

14 This example explains why the variational method is broadly used for approximate treatment of complex
quantum systems, although it is based on more or less intuitive guesses of trial functions, i.e., in contrast with the
perturbation theories discussed in Chapters 6 and 7, does not guarantee asymptotically correct results in any
particular limit, unless such correctness is manually incorporated into the trial state choice.
different for its excited states - even the lowest ones. The reasonably good convergence of the
perturbation theory, which we have seen for the ground state, tells us that we can base our analysis of
wavefunctions ($\psi_e$) of the lowest excited state orbitals on products like $\psi_{100}(\mathbf{r}_k)\,\psi_{nlm}(\mathbf{r}_{k'})$, with n > 1.
However, in order to satisfy the fermion permutation rule, $\hat{\mathcal{P}} = -1$, we have to take the orbital part of the
state in either a symmetric or an antisymmetric form:

$$\psi_e(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\sqrt{2}}\left[\psi_{100}(\mathbf{r}_1)\,\psi_{nlm}(\mathbf{r}_2) \pm \psi_{nlm}(\mathbf{r}_1)\,\psi_{100}(\mathbf{r}_2)\right], \tag{8.40}$$


with the proper total permutation antisymmetry provided by the corresponding spin part given by,
respectively, Eq. (19) or Eq. (21), so that the upper/lower signs in Eq. (40) correspond to the
singlet/triplet spin state. Let us calculate the expectation values of the total energy of the system in the
first order of the perturbation theory. Plugging Eq. (40) into the 0th-order expression

$$\langle E_e\rangle^{(0)} = \int d^3r_1 \int d^3r_2\; \psi_e^*(\mathbf{r}_1, \mathbf{r}_2)\left(\hat{h}_1 + \hat{h}_2\right)\psi_e(\mathbf{r}_1, \mathbf{r}_2), \tag{8.41}$$

we get two groups of similar terms that differ only by the particle index. We can merge the terms of
each pair by changing the notation as ($\mathbf{r}_1 \to \mathbf{r}$, $\mathbf{r}_2 \to \mathbf{r}'$) in one of them, and ($\mathbf{r}_1 \to \mathbf{r}'$, $\mathbf{r}_2 \to \mathbf{r}$) in the
other term. Using Eq. (27), and the mutual orthogonality of wavefunctions $\psi_{100}(\mathbf{r})$ and $\psi_{nlm}(\mathbf{r})$, we get
the following result,

$$\langle E_e\rangle^{(0)} = \int \psi_{100}^*(\mathbf{r})\left(-\frac{\hbar^2\nabla_{\mathbf{r}}^2}{2m} - \frac{2e^2}{4\pi\varepsilon_0 r}\right)\psi_{100}(\mathbf{r})\,d^3r + \int \psi_{nlm}^*(\mathbf{r}')\left(-\frac{\hbar^2\nabla_{\mathbf{r}'}^2}{2m} - \frac{2e^2}{4\pi\varepsilon_0 r'}\right)\psi_{nlm}(\mathbf{r}')\,d^3r' = \varepsilon_{100} + \varepsilon_{nlm}, \tag{8.42}$$


which may be interpreted as the sum of eigenenergies of two separate single particles, one in the ground
state 100, and another in the excited state nlm - despite the fact that the electron states are actually entangled.
Thus, in the 0th order of the perturbation theory, the electron entanglement does not affect their energy.

However, the potential energy of the system also includes the interaction term $U_{\rm int}$ (27) that does
not allow such separation. As a result, in the first approximation of the perturbation theory, the total
energy of the system may be represented as

$$\langle E_e\rangle = \varepsilon_{100} + \varepsilon_{nlm} + E_{\rm int}^{(1)}, \tag{8.43a}$$

$$E_{\rm int}^{(1)} = \langle U_{\rm int}\rangle = \int d^3r_1 \int d^3r_2\; \psi_e^*(\mathbf{r}_1, \mathbf{r}_2)\,U_{\rm int}(\mathbf{r}_1, \mathbf{r}_2)\,\psi_e(\mathbf{r}_1, \mathbf{r}_2). \tag{8.43b}$$

Plugging Eq. (40) into this result, using the symmetry of $U_{\rm int}$ with respect to the particle number
permutation, and the same particle coordinate re-numbering as above, we get

$$E_{\rm int}^{(1)} = E_{\rm dir} \pm E_{\rm ex}, \tag{8.44}$$

with deceivingly similar expressions for the two terms:

$$E_{\rm dir} = \int d^3r \int d^3r'\; \psi_{100}^*(\mathbf{r})\,\psi_{nlm}^*(\mathbf{r}')\,U_{\rm int}(\mathbf{r}, \mathbf{r}')\,\psi_{100}(\mathbf{r})\,\psi_{nlm}(\mathbf{r}'), \tag{8.45a}$$

$$E_{\rm ex} = \int d^3r \int d^3r'\; \psi_{100}^*(\mathbf{r})\,\psi_{nlm}^*(\mathbf{r}')\,U_{\rm int}(\mathbf{r}, \mathbf{r}')\,\psi_{nlm}(\mathbf{r})\,\psi_{100}(\mathbf{r}'). \tag{8.45b}$$
Since the single-particle orbitals can always be made real, both components are positive (or at
least non-negative). However, their physics is completely different. Integral (45a), called the direct
electron-electron interaction, allows a simple semi-classical interpretation as the Coulomb energy of
interacting electrons, each distributed in space with the electric charge density
$\rho_{nlm}(\mathbf{r}) = -e\,\psi_{nlm}^*(\mathbf{r})\,\psi_{nlm}(\mathbf{r})$: 15

$$E_{\rm dir} = \int d^3r \int d^3r'\; \frac{\rho_{100}(\mathbf{r})\,\rho_{nlm}(\mathbf{r}')}{4\pi\varepsilon_0\,|\mathbf{r} - \mathbf{r}'|} \equiv \int \rho_{100}(\mathbf{r})\,\phi(\mathbf{r})\,d^3r, \tag{8.46}$$

where $\phi(\mathbf{r})$ is the electrostatic potential created, at point $\mathbf{r}$, by the counterpart electron’s “electric charge
cloud”: 16

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int d^3r'\; \frac{\rho_{nlm}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}. \tag{8.47}$$

However, integral (45b), called the exchange interaction, evades a classical interpretation, and
(as is clear from its derivation) is a direct corollary of the electron indistinguishability. The
magnitude of $E_{\rm ex}$ is also very much different from $E_{\rm dir}$, because the function under integral (45b)
disappears in those regions where the single-particle wavefunctions do not overlap. This is in full
agreement with the discussion in Sec. 1: if two particles are identical but well separated, i.e. their
wavefunctions do not overlap, the exchange interaction disappears, because all effects of particle
indistinguishability vanish.

Historically, the fact of having two different hydrogen-like spectra (48) and (49) was taken as
evidence for two different species of $^4$He, called, respectively, the parahelium and orthohelium. Figure
1b shows the structure of an excited energy level, with certain quantum numbers n > 1, l, and m, given
by Eqs. (44)-(45). The upper level, with energy

$$E_{\rm para} = (\varepsilon_{100} + \varepsilon_{nlm}) + E_{\rm dir} + E_{\rm ex} > \varepsilon_{100} + \varepsilon_{nlm}, \tag{8.48}$$

corresponds to the “parahelium”, i.e. the symmetric orbital state and hence to the singlet spin state (19),
with zero net spin, s = 0. The lower level, with

$$E_{\rm ortho} = (\varepsilon_{100} + \varepsilon_{nlm}) + E_{\rm dir} - E_{\rm ex} < E_{\rm para}, \tag{8.49}$$

corresponds to the “orthohelium”, i.e. the antisymmetric orbital state, and hence to the triplet spin state(s) with s =
1 - see Eq. (21). Its degeneracy may be lifted by magnetic field, so that the splitting is identical to that
of an elementary particle with spin s = 1. Calculations of the direct and exchange interaction integrals
(45) for various values of n and l show that the perturbation theory explains the experimental spectrum
of the orthohelium and parahelium (Fig. 1) pretty well.

Encouraged by this success, and motivated by the very important task of describing atoms,
molecules, and metals, we may try to apply the same approach to systems with N > 2 electrons. In this
case the mathematical expression of the Pauli principle for fermions is


15 See, e.g., EM Sec. 1.3, in particular Eq. (1.54). 

16 Note that the result for $E_{\rm dir}$ correctly reflects the basic fact that a charged particle does not interact with itself,
even if its wavefunction is quantum-mechanically spread over a finite space volume. Unfortunately, this is not
true for some other approximate theories of multi-particle systems - see Sec. 4 below.
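$$\hat{\mathcal{P}}_{kk'} = -1, \quad \text{for all} \quad k, k' = 1, 2, \ldots, N, \tag{8.50}$$

where operator $\hat{\mathcal{P}}_{kk'}$ permutes particles with numbers k and k'. In order to understand how common
eigenstates of all such operators may be formed, let us return for a minute to two non-interacting
electrons, and rewrite Eq. (11) in the following compact form:

$$|\alpha_-\rangle = \frac{1}{\sqrt{2}} \begin{vmatrix} |\beta\rangle & |\beta'\rangle \\ |\beta\rangle & |\beta'\rangle \end{vmatrix}, \tag{8.51}$$

where the two columns correspond to the single-particle states (state 1 and state 2), the first row to particle
number 1, the second row to particle number 2, and the products of the determinant’s elements are
understood as direct products of the ket-vectors.

In this way, the Pauli principle is mapped on the well-known property of matrix determinants: if any
two columns of a matrix coincide, its determinant vanishes. This Slater determinant approach may be
readily generalized to N fermions in N (not necessarily lowest) single-particle states $\beta$, $\beta'$, etc:

$$|\alpha_-\rangle = \frac{1}{(N!)^{1/2}} \begin{vmatrix} |\beta\rangle & |\beta'\rangle & |\beta''\rangle & \cdots \\ |\beta\rangle & |\beta'\rangle & |\beta''\rangle & \cdots \\ |\beta\rangle & |\beta'\rangle & |\beta''\rangle & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{vmatrix}, \tag{8.52}$$

where the N columns run over the state list, and the N rows over the particle list.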


Even though the Slater determinant form is extremely nice and compact (in comparison with
direct writing of a sum of N! products, each of N ket factors), there are two major problems with using it
for practical calculations:

(i) For the calculation of any bra-ket product (say, within the perturbation theory) we need to
spell out each bra- and ket-vector as a sum of component terms. Even for a limited number of electrons
(say $N \sim 10^2$ in a typical atom), the number $N! \sim 10^{160}$ of terms in such a sum is impracticably large for
any analytical calculation.

(ii) In the case of interacting fermions, Slater determinants do not describe the eigenvectors of
the system; rather the stationary state is a superposition of such determinants - each for a specific
selection of N states from the general set of single-particle states - whose number is generally different from N.

These challenges make the development of a more general theory that would not use particle
numbers (which are superficial for indistinguishable particles to start with) a must for getting any final
results for multiparticle systems.
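The determinant construction, and its cost, may be illustrated in the coordinate representation. The following is a minimal sketch, assuming purely for illustration that the single-particle orbitals are the eigenfunctions of a 1D box on the segment [0, 1]; the function names `orbital`, `slater` and `slater_explicit` are hypothetical. It compares the determinant with the explicit sum over all N! signed permutations.

```python
import math
from itertools import permutations

import numpy as np

def orbital(j, x):
    """Illustrative single-particle orbitals: 1D box eigenfunctions on [0, 1], j = 0, 1, ..."""
    return math.sqrt(2.0) * math.sin((j + 1) * math.pi * x)

def slater(states, coords):
    """Antisymmetric wavefunction as a determinant: rows = particles, columns = states."""
    n = len(states)
    matrix = np.array([[orbital(j, x) for j in states] for x in coords])
    return np.linalg.det(matrix) / math.sqrt(math.factorial(n))

def permutation_sign(perm):
    """Parity of a permutation, found by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for k in range(i + 1, len(perm)) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

def slater_explicit(states, coords):
    """The same wavefunction as an explicit sum of N! signed products."""
    n = len(states)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(permutation_sign(perm))
        for particle, state_index in enumerate(perm):
            term *= orbital(states[state_index], coords[particle])
        total += term
    return total / math.sqrt(math.factorial(n))

coords = [0.13, 0.42, 0.77]
psi = slater([0, 1, 2], coords)
psi_swapped = slater([0, 1, 2], [0.42, 0.13, 0.77])   # particles 1 and 2 exchanged
psi_pauli = slater([0, 1, 1], coords)                 # two particles in the same state

print(psi, psi_swapped, psi_pauli, slater_explicit([0, 1, 2], coords))
```

The exchange of two particle coordinates flips the sign of the result, and two coinciding states give zero, as the Pauli principle requires. The explicit sum agrees with the determinant but its number of terms grows as N!, which is the difficulty named in item (i) above.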


8.3. Second quantization 

The most useful formalism for this purpose, which avoids particle numbering altogether, is called the
second quantization. 17 Actually, we have already discussed a particular version of this formalism, for

17 It was invented (first for photons and then for arbitrary bosons) by P. Dirac in 1927, and then modified in 1928 
for fermions by E. Wigner and P. Jordan. The term “second quantization” is rather misleading for the 
nonrelativistic applications we are discussing, but finds certain justification in the quantum field theory. 
the case of the 1D harmonic oscillator’s excitations, in Sec. 5.4. As a reminder, we have used Eqs. (5.98)
to define the “creation” and “annihilation” operators via the usual operators of coordinate and
momentum, and then proved their key property (5.122),

$$\hat{a}^\dagger|n\rangle = (n+1)^{1/2}\,|n+1\rangle, \qquad \hat{a}|n\rangle = n^{1/2}\,|n-1\rangle, \tag{8.53}$$

where n are the stationary (Fock) states of the oscillator. This property allows an interpretation of
the operators’ actions as the creation/annihilation of a single excitation of energy $\hbar\omega_0$ - thus justifying the
operator names. In the next chapter, we will show that such an excitation of an electromagnetic field
mode may be considered as a massless boson with s = 1, called the photon.


In order to generalize this approach to arbitrary bosons, not appealing to a specific system such
as the harmonic oscillator, we may use relations similar to Eq. (53) to define the creation and
annihilation operators. The definition looks simple in the language of the so-called Dirac states, with
ket-vectors

$$|N_1, N_2, \ldots, N_j, \ldots\rangle, \tag{8.54}$$

where $N_j$ are the state occupancies, i.e. the numbers of bosons in each single-particle state j. Let me
emphasize that here indices 1, 2, ..., j, ... are the positions of each number in the Dirac ket vector, i.e. are
the numbers of single-particle states (including their spin parts) rather than particles. Thus the very
notion of individual particle numbers is completely (and for indistinguishable particles, very relevantly)
absent from this formalism. Generally, the set of single-particle states participating in the Dirac state
may be selected in an arbitrary way, provided that it is full and orthonormal, so that

$$\langle N_1', N_2', \ldots, N_j', \ldots|N_1, N_2, \ldots, N_j, \ldots\rangle = \delta_{N_1N_1'}\,\delta_{N_2N_2'} \cdots \delta_{N_jN_j'} \cdots, \tag{8.55}$$

though for a system of non- (or weakly) interacting bosons, using the stationary states of individual
particles in the system under analysis is almost always the best choice.


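Now we can define the particle annihilation operator as follows:

$$\hat{a}_j\,|N_1, N_2, \ldots, N_j, \ldots\rangle \equiv N_j^{1/2}\,|N_1, N_2, \ldots, N_j - 1, \ldots\rangle. \tag{8.56}$$

Note that the pre-ket coefficient, similar to that in Eq. (53), guarantees that an attempt to annihilate a
particle in an unpopulated state gives the non-existing (null) state:

$$\hat{a}_j\,|N_1, N_2, \ldots, 0_j, \ldots\rangle = 0, \tag{8.57}$$

where symbol $0_j$ means zero occupancy of the j-th state. An alternative way to write Eq. (56) is

$$\langle N_1', N_2', \ldots, N_j', \ldots|\,\hat{a}_j\,|N_1, N_2, \ldots, N_j, \ldots\rangle = N_j^{1/2}\,\delta_{N_1N_1'}\,\delta_{N_2N_2'} \cdots \delta_{N_j', N_j - 1} \cdots \tag{8.58}$$

According to Eq. (4.65), the matrix element of the Hermitian conjugate operator $\hat{a}_j^\dagger$ is

$$\langle N_1', N_2', \ldots, N_j', \ldots|\,\hat{a}_j^\dagger\,|N_1, N_2, \ldots, N_j, \ldots\rangle = \langle N_1, N_2, \ldots, N_j, \ldots|\,\hat{a}_j\,|N_1', N_2', \ldots, N_j', \ldots\rangle^*$$
$$= \langle N_1, N_2, \ldots, N_j, \ldots|\,(N_j')^{1/2}\,|N_1', N_2', \ldots, N_j' - 1, \ldots\rangle = (N_j')^{1/2}\,\delta_{N_1N_1'}\,\delta_{N_2N_2'} \cdots \delta_{N_j, N_j' - 1} \cdots$$
$$= (N_j + 1)^{1/2}\,\delta_{N_1N_1'}\,\delta_{N_2N_2'} \cdots \delta_{N_j + 1, N_j'} \cdots, \tag{8.59}$$

meaning that

$$\hat{a}_j^\dagger\,|N_1, N_2, \ldots, N_j, \ldots\rangle = (N_j + 1)^{1/2}\,|N_1, N_2, \ldots, N_j + 1, \ldots\rangle, \tag{8.60}$$

in total compliance with the first of Eqs. (53). In particular, this particle creation operator $\hat{a}_j^\dagger$ allows
the description of the generation of a single particle from the vacuum (not null!) state $|0, 0, \ldots\rangle$:

$$\hat{a}_j^\dagger\,|0, 0, \ldots, 0_j, \ldots, 0\rangle = |0, 0, \ldots, 1_j, \ldots, 0\rangle, \tag{8.61}$$

and hence a product of such operators may create, from the vacuum, a multiparticle state with an
arbitrary set of occupancies: 18

$$\underbrace{\hat{a}_1^\dagger \hat{a}_1^\dagger \cdots \hat{a}_1^\dagger}_{N_1\ \text{times}}\ \underbrace{\hat{a}_2^\dagger \hat{a}_2^\dagger \cdots \hat{a}_2^\dagger}_{N_2\ \text{times}} \cdots |0, 0, \ldots\rangle = (N_1!\,N_2! \cdots)^{1/2}\,|N_1, N_2, \ldots\rangle. \tag{8.62}$$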

Next, combining Eqs. (56) and (60), we get

$$\hat{a}_j^\dagger \hat{a}_j\,|N_1, N_2, \ldots, N_j, \ldots\rangle = N_j\,|N_1, N_2, \ldots, N_j, \ldots\rangle, \tag{8.63}$$

so that, just as for the particular case of harmonic oscillator excitations, the number-counting operator

$$\hat{N}_j \equiv \hat{a}_j^\dagger \hat{a}_j \tag{8.64}$$

conserves the numbers of particles in all single-particle states, and simultaneously “counts” their number
in the j-th state. Acting by the creation-annihilation operators in the reverse order, we get

$$\hat{a}_j \hat{a}_j^\dagger\,|N_1, N_2, \ldots, N_j, \ldots\rangle = (N_j + 1)\,|N_1, N_2, \ldots, N_j, \ldots\rangle. \tag{8.65}$$

This result shows that for any state of a multiparticle system (which always may be represented as a
linear superposition of Dirac states with different sets of $N_j$), we can write

$$\hat{a}_j \hat{a}_j^\dagger - \hat{a}_j^\dagger \hat{a}_j \equiv \left[\hat{a}_j, \hat{a}_j^\dagger\right] = \hat{I}, \tag{8.66}$$

again in agreement with what we had for the 1D oscillator - cf. Eq. (5.101). According to Eq. (55), the
creation and annihilation operators corresponding to different single-particle states do commute, so that
Eq. (66) may be generalized as


18 The resulting Dirac state is not an eigenstate of every multiparticle Hamiltonian. However, we will see below 
that for a set of non-interacting particles it is an eigenstate, and thus may be used in the basis for perturbation 
theories of systems of weakly interacting particles. 


$$\left[\hat{a}_j, \hat{a}_{j'}^\dagger\right] = \hat{I}\,\delta_{jj'}, \tag{8.67}$$

while similar bosonic creation and annihilation operators commute, regardless of which states they
act upon:

$$\left[\hat{a}_j^\dagger, \hat{a}_{j'}^\dagger\right] = \left[\hat{a}_j, \hat{a}_{j'}\right] = 0. \tag{8.68}$$

Relations (66)-(68) are the mathematical expression of the independence of occupancies of different
boson states.
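The operator relations above can be verified with explicit matrices. The following is a minimal sketch, assuming a Fock space truncated at a finite occupancy `n_max` (an assumption of this illustration, not of the formalism); in such a truncated space the commutator of an annihilation operator with its creation counterpart reproduces the identity only below the cutoff. The names `annihilation`, `a1`, `a2` are illustrative.

```python
import numpy as np

def annihilation(n_max):
    """Matrix of the annihilation operator in the truncated Fock basis |0>, ..., |n_max>."""
    a = np.zeros((n_max + 1, n_max + 1))
    for n in range(1, n_max + 1):
        a[n - 1, n] = np.sqrt(n)      # a|n> = n^(1/2) |n-1>, as in Eq. (8.56)
    return a

n_max = 5
a = annihilation(n_max)
a_dag = a.T                            # Hermitian conjugate (the matrix is real)

number = a_dag @ a                     # number-counting operator, Eq. (8.64)
commutator = a @ a_dag - a_dag @ a     # should equal the identity, Eq. (8.66)

# Two different single-particle states: operators act in the direct-product space
identity = np.eye(n_max + 1)
a1 = np.kron(a, identity)
a2 = np.kron(identity, a)
cross_commutator = a1 @ a2.T - a2.T @ a1   # [a_1, a_2^dagger], Eq. (8.67) with j != j'
same_kind = a1 @ a2 - a2 @ a1              # [a_1, a_2], Eq. (8.68)

print(np.diag(number))
print(np.diag(commutator))
```

The diagonal of the number operator lists the occupancies 0, 1, ..., `n_max`. The last diagonal element of the commutator differs from 1 only because of the truncation, while the operators of different states commute exactly.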


As was mentioned earlier, a major challenge in the Dirac approach is to rewrite the Hamiltonian
of a multiparticle system, which naturally carries particle numbers k (see, e.g., Eq. (22) for k = 1, 2), in the
second quantization language, in which there are no such numbers. Let us start with single-particle
components of such Hamiltonians, i.e. single-particle operators of the type

$$\hat{F} = \sum_{k=1}^{N} \hat{f}_k, \tag{8.69}$$

where all N operators $\hat{f}_k$ are similar, except that each of them acts on one specific (k-th) particle, and N
is the total number of particles in the system, which is naturally equal to the sum of single-particle state
occupancies:

$$N = \sum_j N_j. \tag{8.70}$$

The most important examples of such operators are the kinetic energy of N similar single particles, and
their potential energy in an external field:

$$\hat{T} = \sum_{k=1}^{N} \frac{\hat{p}_k^2}{2m}, \qquad \hat{U} = \sum_{k=1}^{N} \hat{u}(\mathbf{r}_k). \tag{8.71}$$

In order to express a particle-separable operator of the type (69) in terms of the Dirac formalism,
we need to return for a minute to the particle-number representations used in the beginning of this
chapter. Instead of the Slater determinant (52), for bosons we have to write a similar expression, but
without the sign changes (sometimes called the permanent):

$$|N_1, \ldots, N_j, \ldots\rangle = \left(\frac{N_1! \cdots N_j! \cdots}{N!}\right)^{1/2} \sum_P\, |\underbrace{\ldots \beta\beta'\beta'' \ldots}_{N\ \text{operands}}\rangle. \tag{8.72}$$

Note again that the left-hand part of this relation is written in the Dirac notation (that does not
use particle numbering), while in its right-hand part, just as in relations of Secs. 1-2, particle numbers are
coded with the positions of the single-particle states inside the ket-vectors, and the sum is over all
different permutations of the states in the ket - cf. Eq. (10). (According to the elementary
combinatorics, 19 there are $N!/(N_1! \ldots N_j! \ldots)$ such permutations, so that the coefficient before the sum
ensures the proper normalization of the Dirac state.) Let us use Eq. (72) to spell out the
following bra-ket of a system with (N - 1) particles:


19 See, e.g., MA Eq. (2.3). 
$$\langle \ldots N_j, \ldots N_{j'} - 1, \ldots|\,\hat{F}\,|\ldots N_j - 1, \ldots N_{j'}, \ldots\rangle = \frac{\left[N_1! \cdots N_j! \cdots (N_{j'} - 1)! \cdots\right]^{1/2}\left[N_1! \cdots (N_j - 1)! \cdots N_{j'}! \cdots\right]^{1/2}}{(N-1)!} \sum_{P(N-1)}\ \sum_{P'(N-1)} \langle \ldots \beta\beta'\beta'' \ldots|\, \sum_{k=1}^{N-1} \hat{f}_k\, |\ldots \beta\beta'\beta'' \ldots\rangle, \tag{8.73}$$

where all non-specified occupation numbers in the corresponding positions of the bra- and ket-vectors
are equal to each other. Each single-particle operator $\hat{f}_k$, participating in the operator sum, acts on the
bra- and ket-vectors of states j and j', respectively, in a certain (say, k-th) position, giving the result that
does not depend on the position number:

$$\langle\beta_j|_{\text{in } k\text{-th position}}\ \hat{f}_k\ |\beta_{j'}\rangle_{\text{in } k\text{-th position}} = \langle\beta_j|\,\hat{f}\,|\beta_{j'}\rangle \equiv f_{jj'}. \tag{8.74}$$

Since in both permutation sets participating in Eq. (73), with (N - 1) vectors each, all positions are
equivalent, we can fix the position (say, take the first one) and replace the sum over k by the
multiplication by factor (N - 1). The fraction of permutations with the necessary bra-vector (with
number j) in that position is $N_j/(N - 1)$, while that with the necessary ket-vector (with number j') in the
same position is $N_{j'}/(N - 1)$. As the result, the permutation sum in Eq. (73) reduces to

$$(N - 1)\,\frac{N_j}{N - 1}\,\frac{N_{j'}}{N - 1}\,f_{jj'} \sum_{P(N-2)}\ \sum_{P'(N-2)} \langle \ldots \beta\beta'\beta'' \ldots|\ldots \beta\beta'\beta'' \ldots\rangle, \tag{8.75}$$

where our specific position k is now excluded from both the bra- and ket-vector permutations. Each of
these permutations now includes only ($N_j$ - 1) states j and ($N_{j'}$ - 1) states j', so that, using the state
orthonormality, we finally arrive at a very simple result:

$$\langle \ldots N_j, \ldots N_{j'} - 1, \ldots|\,\hat{F}\,|\ldots N_j - 1, \ldots N_{j'}, \ldots\rangle = (N_j N_{j'})^{1/2} f_{jj'}. \tag{8.76}$$


Now let us calculate matrix elements of the following operator: 


T,fr 




a j a ,, . 


jj 


A direct application of Eqs. (56) and (60) shows that the only nonvanishing of them are 


(8.77) 


(8.78) 


But this is exactly the last form of Eq. (76), so that in the basis of Dirac states, operator (69) may be 
represented as singie- 

particle 

(8-79) ST 

quantization 
form 

This beautifully simple equation is the most important formula of the second quantization theory, 
and is essentially the Dirac-language analog of Eq. (4.59) of the single-particle quantum mechanics. 

Each term of the sum may be described by a very simple mnemonic rule: if an operator “connects” two
single-particle states j and j', move the particle from state j' into state j, and weigh the result with the
corresponding single-particle matrix element. (One of the corollaries of Eq. (79) is that the expectation
value of an operator whose eigenstates coincide with the Dirac states is

$$\langle \ldots N_j \ldots |\hat{F}| \ldots N_j \ldots \rangle = \sum_j f_{jj} N_j, \tag{8.80}$$

with an evident physical interpretation as the sum of single-particle expectation values over all states,
weighed by state occupancies.)
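The boson matrix elements quoted above are easy to check numerically. The following is only a minimal sketch, not part of the derivation: it assumes two bosonic states, a truncation of each occupancy at a hypothetical `n_max`, and made-up real matrix elements `f`; the function names are illustrative.

```python
import numpy as np

def boson_annihilators(n_modes, n_max):
    """Annihilation operators a_j for n_modes bosonic states, each truncated to 0..n_max."""
    dim = n_max + 1
    a_single = np.diag(np.sqrt(np.arange(1, dim)), k=1)  # a|n> = sqrt(n)|n-1>
    identity = np.eye(dim)
    ops = []
    for j in range(n_modes):
        op = np.array([[1.0]])
        for mode in range(n_modes):
            op = np.kron(op, a_single if mode == j else identity)
        ops.append(op)
    return ops

def dirac_state(occupancies, n_max):
    """Ket |N_1, N_2, ...> as a vector in the truncated occupation-number basis."""
    vec = np.array([1.0])
    for n in occupancies:
        single = np.zeros(n_max + 1)
        single[n] = 1.0
        vec = np.kron(vec, single)
    return vec

# Hypothetical single-particle matrix elements f_jj' (real symmetric, illustration only).
f = np.array([[0.3, 0.7],
              [0.7, -0.2]])

n_max = 4
a = boson_annihilators(2, n_max)
# Second-quantized operator: sum over j, j' of f_jj' a_j^dagger a_j'
F = sum(f[j, jp] * a[j].T @ a[jp] for j in range(2) for jp in range(2))

N1, N2 = 3, 2
bra = dirac_state((N1, N2 - 1), n_max)
ket = dirac_state((N1 - 1, N2), n_max)
off_diagonal = bra @ F @ ket        # compare with sqrt(N1*N2) * f[0, 1]

state = dirac_state((2, 2), n_max)
expectation = state @ F @ state     # compare with 2*f[0, 0] + 2*f[1, 1]

print(off_diagonal, np.sqrt(N1 * N2) * f[0, 1])
print(expectation, 2 * f[0, 0] + 2 * f[1, 1])
```

The truncation is harmless here because none of the states involved reaches the cutoff occupancy.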

Proceeding to fermions, which have to obey the Pauli principle, we immediately notice that any
occupation number $N_j$ may only take two values, 0 or 1. In order to account for that, and also make the
key equation (76) valid for fermions as well, the creation-annihilation operators are now defined by the
relations

$$\hat{a}_j|N_1, N_2, \ldots, 0_j, \ldots\rangle = 0, \qquad \hat{a}_j|N_1, N_2, \ldots, 1_j, \ldots\rangle = (-1)^{\Sigma(1,\,j-1)}|N_1, N_2, \ldots, 0_j, \ldots\rangle, \tag{8.81}$$

$$\hat{a}_j^\dagger|N_1, N_2, \ldots, 0_j, \ldots\rangle = (-1)^{\Sigma(1,\,j-1)}|N_1, N_2, \ldots, 1_j, \ldots\rangle, \qquad \hat{a}_j^\dagger|N_1, N_2, \ldots, 1_j, \ldots\rangle = 0. \tag{8.82}$$

In these relations, the symbol $\Sigma(J, J')$ means the sum of all occupancy numbers in state positions from J to
J', including the border points:

$$\Sigma(J, J') \equiv \sum_{j=J}^{J'} N_j, \tag{8.83}$$


so that the sum participating in Eqs. (81) and (82) is the total occupancy of all states with the numbers 
below j. (The states have to be numbered in a fixed albeit arbitrary order.) As a result, Eqs. (81)-(82) 
may be readily summarized in the verbal form: if an operator replaces the j th state occupancy with the 
opposite one (1 with 0, or vice versa), it also changes sign before the result if (and only if) the total 
number of particles in states with j’<j is odd. 

One of the corollaries of this (somewhat counter-intuitive) rule of sign alternation is that the sign of
the ket-vector of a completely filled two-state system depends on how exactly it has been formed from
the vacuum state. Indeed, if we start from creating the fermion in state 1, we get

$$\hat{a}_1^\dagger|0,0\rangle = (-1)^0|1,0\rangle = |1,0\rangle, \qquad \hat{a}_2^\dagger\hat{a}_1^\dagger|0,0\rangle = \hat{a}_2^\dagger|1,0\rangle = (-1)^1|1,1\rangle = -|1,1\rangle, \tag{8.84}$$

while if the operator order is different, the result's sign is opposite:

$$\hat{a}_2^\dagger|0,0\rangle = (-1)^0|0,1\rangle = |0,1\rangle, \qquad \hat{a}_1^\dagger\hat{a}_2^\dagger|0,0\rangle = \hat{a}_1^\dagger|0,1\rangle = (-1)^0|1,1\rangle = +|1,1\rangle. \tag{8.85}$$


Since the action of any of these operator products on any initial state other than the vacuum gives the null
ket, we can write the following operator equality:

$$\hat{a}_1^\dagger\hat{a}_2^\dagger + \hat{a}_2^\dagger\hat{a}_1^\dagger \equiv \{\hat{a}_1^\dagger, \hat{a}_2^\dagger\} = \hat{0}. \tag{8.86}$$

It is straightforward to check that this result is valid for the Dirac vector of an arbitrary length, and does
not depend on the occupancy of other states, so that we can always write

$$\{\hat{a}_j^\dagger, \hat{a}_{j'}^\dagger\} = \{\hat{a}_j, \hat{a}_{j'}\} = \hat{0}; \tag{8.87}$$


these equalities hold for j = j' as well. On the other hand, the absolutely similar calculation shows that
the mixed creation-annihilation operator products do depend on whether the states are different or not: 20

$$\{\hat{a}_j, \hat{a}_{j'}^\dagger\} = \hat{I}\,\delta_{jj'}. \tag{8.88}$$

These equations look very much like Eqs. (67)-(68) for bosons, “only” with the replacement of
commutators with anticommutators. Since the core laws of quantum mechanics, including the operator
compatibility (Sec. 4.5) and the Heisenberg equation (4.199) of operator evolution in time, involve
commutators rather than anticommutators, one might think that the behavior of bosonic and
fermionic multiparticle systems should be dramatically different. However, the difference is not as huge
as one could expect; for one, a straightforward check shows that the sign factors in Eqs. (81)-(82)
compensate those in the Slater determinant, and make the key relation (79) valid for the fermions as
well. (Indeed, this is the very goal of the introduction of these factors.)
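The sign rule of Eqs. (81)-(82) and the resulting anticommutation relations may also be verified by brute force. The sketch below is only an illustration under stated assumptions: it builds the annihilation operators as matrices in the basis of Dirac states for a small, arbitrarily chosen number of single-particle states, with states numbered from left to right; the helper names are hypothetical.

```python
import itertools
import numpy as np

def fermion_annihilators(n_states):
    """Matrices of a_j in the basis of Dirac states |N_1,...,N_M>, with N_j = 0 or 1."""
    basis = list(itertools.product((0, 1), repeat=n_states))
    index = {occ: i for i, occ in enumerate(basis)}
    dim = len(basis)
    ops = []
    for j in range(n_states):
        a_j = np.zeros((dim, dim))
        for occ in basis:
            if occ[j] == 1:
                # Sign factor (-1)^Sigma(1, j-1): total occupancy of states below j
                sign = (-1) ** sum(occ[:j])
                emptied = occ[:j] + (0,) + occ[j + 1:]
                a_j[index[emptied], index[occ]] = sign
        ops.append(a_j)
    return ops, index

def anticommutator(x, y):
    return x @ y + y @ x

# Check {a_j, a_k} = 0 and {a_j, a_k^dagger} = delta_jk for three states.
ops, index = fermion_annihilators(3)
identity = np.eye(2 ** 3)
relations_hold = all(
    np.allclose(anticommutator(ops[j], ops[k]), 0)
    and np.allclose(anticommutator(ops[j], ops[k].T), identity * (j == k))
    for j in range(3) for k in range(3)
)

# Two-state system: the sign of |1,1> depends on the order of creation.
ops2, index2 = fermion_annihilators(2)
vacuum = np.zeros(4)
vacuum[index2[(0, 0)]] = 1.0
filled = np.zeros(4)
filled[index2[(1, 1)]] = 1.0
order_21 = ops2[1].T @ ops2[0].T @ vacuum   # a_2^dagger a_1^dagger |0,0>
order_12 = ops2[0].T @ ops2[1].T @ vacuum   # a_1^dagger a_2^dagger |0,0>

print(relations_hold)
print(order_21 @ filled, order_12 @ filled)
```

Since all matrices are real, the transpose plays the role of the Hermitian conjugate here.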


As the simplest example, let us examine what the second quantization formalism says about the
dynamics of non-interacting particles in a system whose single-particle properties we know well,
namely two nearly-similar, coupled quantum wells - see Fig. 2.23. If the coupling (tunneling) between
the wells is so small that the states localized in the wells are only weakly perturbed, then in the basis of these
states, the single-particle Hamiltonian of the system may be represented by the 2×2 matrix (6.27). Selecting
the origin of energy at the middle between the energies of unperturbed states, so that the coefficient $a_0$ in Eq.
(6.27) vanishes, we can reduce the matrix to

$$\mathrm{h} = \mathbf{a}\cdot\boldsymbol{\sigma} = \begin{pmatrix} a_z & a_- \\ a_+ & -a_z \end{pmatrix}, \qquad a_\pm \equiv a_x \pm i a_y, \tag{8.89}$$


with eigenvalues

$$\varepsilon_\pm = \pm a, \qquad a \equiv |\mathbf{a}| = \left(a_x^2 + a_y^2 + a_z^2\right)^{1/2}. \tag{8.90}$$


Now following recipe (79), we can represent the Hamiltonian of the whole system of particles in terms 
of the creation-annihilation operators: 


$$\hat{H} = a_z\hat{a}_1^\dagger\hat{a}_1 + a_-\hat{a}_1^\dagger\hat{a}_2 + a_+\hat{a}_2^\dagger\hat{a}_1 - a_z\hat{a}_2^\dagger\hat{a}_2, \tag{8.91}$$


where $\hat{a}_{1,2}^\dagger$ and $\hat{a}_{1,2}$ are the operators of creation and annihilation of a particle localized in the
corresponding quantum well. According to Eq. (64), the first and the last terms of the right-hand part of
Eq. (91) describe particle energies in uncoupled wells,

$$a_z\hat{a}_1^\dagger\hat{a}_1 = \varepsilon_1\hat{N}_1, \qquad -a_z\hat{a}_2^\dagger\hat{a}_2 = \varepsilon_2\hat{N}_2, \tag{8.92}$$


while the sum of the two middle terms is the second-quantization description of tunneling between the
wells.

20 A by-product of this calculation is a proof that operator (57) counts the number of particles $N_j$ (now equal to
either 1 or 0), just as it does for bosons.


Now we can use the general Eq. (4.199) of the Heisenberg picture to find the equations of
motion for the creation-annihilation operators. For example,

$$i\hbar\,\dot{\hat{a}}_1 = [\hat{a}_1, \hat{H}] = a_z[\hat{a}_1, \hat{a}_1^\dagger\hat{a}_1] + a_-[\hat{a}_1, \hat{a}_1^\dagger\hat{a}_2] + a_+[\hat{a}_1, \hat{a}_2^\dagger\hat{a}_1] - a_z[\hat{a}_1, \hat{a}_2^\dagger\hat{a}_2]. \tag{8.93}$$

Since the Bose and Fermi operators satisfy different commutation relations, one could expect the
right-hand part of this equation to be different for bosons and fermions. However, it is not so. Indeed, all
commutators in the right-hand part of Eq. (93) have the following form:

$$[\hat{a}_j, \hat{a}_{j'}^\dagger\hat{a}_{j''}] \equiv \hat{a}_j\hat{a}_{j'}^\dagger\hat{a}_{j''} - \hat{a}_{j'}^\dagger\hat{a}_{j''}\hat{a}_j. \tag{8.94}$$


According to Eqs. (67) and (88), the first pair product of the operators may be recast as

$$\hat{a}_j\hat{a}_{j'}^\dagger = \hat{I}\,\delta_{jj'} \pm \hat{a}_{j'}^\dagger\hat{a}_j, \tag{8.95}$$

where the upper sign pertains to bosons and the lower to fermions, while according to Eqs. (68) and
(87), the very last pair product is

$$\hat{a}_{j''}\hat{a}_j = \pm\hat{a}_j\hat{a}_{j''}, \tag{8.96}$$

with the same sign convention. Plugging these expressions into Eq. (94), we see that regardless of the
particle statistics, the two last terms cancel, and we arrive at a universal (and generally very useful)
commutation rule

$$[\hat{a}_j, \hat{a}_{j'}^\dagger\hat{a}_{j''}] = \hat{a}_{j''}\,\delta_{jj'}, \tag{8.97}$$


valid for particles of both kinds. As a result, the Heisenberg equation of motion for the operator $\hat{a}_1$, and the
equation for $\hat{a}_2$ (that may be obtained absolutely similarly), are also statistics-independent: 21

$$i\hbar\,\dot{\hat{a}}_1 = a_z\hat{a}_1 + a_-\hat{a}_2, \qquad i\hbar\,\dot{\hat{a}}_2 = a_+\hat{a}_1 - a_z\hat{a}_2. \tag{8.98}$$


Thus we have got a system of coupled, linear differential equations that are identical to the
equations for the c-number probability amplitudes of single-particle wavefunctions of a two-level
system - see Eq. (2.201) and Problem 4.10. Their general solution is a linear superposition of
exponents:

$$\hat{a}_{1,2}(t) = \sum_\pm \hat{c}_{1,2}^{(\pm)}\exp\{\lambda_\pm t\}. \tag{8.99}$$


21 Equations of motion for the creation operators $\hat{a}_{1,2}^\dagger$ are just the Hermitian conjugates of Eqs. (98), and do not add
any new information about the system's dynamics.

As usual, in order to find the exponents $\lambda_\pm$, it is sufficient to plug a particular solution
$\hat{a}_{1,2}(t) = \hat{c}_{1,2}\exp\{\lambda t\}$ into Eq. (98) and require that the determinant of the resulting homogeneous, linear
system for the “coefficients” (actually, time-independent operators) $\hat{c}_{1,2}$ equals zero. This gives us the
following characteristic equation

$$\begin{vmatrix} a_z - i\hbar\lambda & a_- \\ a_+ & -a_z - i\hbar\lambda \end{vmatrix} = 0, \tag{8.100}$$


with two roots $\lambda_\pm = \pm i\Omega/2$, where $\Omega \equiv 2a/\hbar$. Now plugging each of the roots, one by one, into the system
of equations for $\hat{c}_{1,2}$, we can find these operators, and hence the general solution of system (98) for
arbitrary initial conditions.

Let us consider the simple case $a_y = a_z = 0$ (meaning in particular that the well eigenenergies are
exactly aligned), so that $\hbar\Omega/2 \equiv a = a_x$; then the solution of Eq. (98) is

$$\hat{a}_1(t) = \hat{a}_1(0)\cos\frac{\Omega t}{2} - i\hat{a}_2(0)\sin\frac{\Omega t}{2}, \qquad \hat{a}_2(t) = -i\hat{a}_1(0)\sin\frac{\Omega t}{2} + \hat{a}_2(0)\cos\frac{\Omega t}{2}. \tag{8.101}$$


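Multiplying the first of Eqs. (101) by its Hermitian conjugate, and ensemble-averaging the result, we get

$$\langle\hat{a}_1^\dagger(t)\hat{a}_1(t)\rangle = \langle\hat{a}_1^\dagger(0)\hat{a}_1(0)\rangle\cos^2\frac{\Omega t}{2} + \langle\hat{a}_2^\dagger(0)\hat{a}_2(0)\rangle\sin^2\frac{\Omega t}{2} - i\left\langle\hat{a}_1^\dagger(0)\hat{a}_2(0) - \hat{a}_2^\dagger(0)\hat{a}_1(0)\right\rangle\sin\frac{\Omega t}{2}\cos\frac{\Omega t}{2}. \tag{8.102}$$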


Let us consider the particular case when the initial state of the system is a Dirac state, i.e. has a
definite number of particles in each well; in this case only the first two terms in the right-hand part are
different from zero: 22

$$\langle N_1\rangle = N_1(0)\cos^2\frac{\Omega t}{2} + N_2(0)\sin^2\frac{\Omega t}{2}. \tag{8.103}$$


For one particle, initially placed in either well, this gives us our old result (2.185) describing quantum
oscillations of the particle between two wells with frequency $\Omega$. However, Eq. (103) is valid for any set
of initial occupancies; let us use it. For example, starting from two particles, with initially one particle in
each well, we get $\langle N_1\rangle = 1$, regardless of time. So, the occupancies do not oscillate, and no experiment
may detect the quantum oscillations, though their frequency $\Omega$ is still formally present in the time
evolution equations. This fact may be interpreted as the simultaneous quantum oscillations of two
particles exactly in anti-phase. For bosons, we can go to even larger occupancies by preparing the
system, for example, in the state with $N_1(0) = N$, $N_2(0) = 0$. Equation (103) says that in this case
the quantum oscillation amplitude increases N-fold; this is a particular manifestation of the general
fact that bosons can be (and evolve in time) in the same quantum state. On the other hand, for fermions
we cannot increase initial occupancies beyond 1, so that the largest oscillation amplitude we can get is if
we initially fill just one well.
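These statements are simple to reproduce numerically. The sketch below is a hedged illustration rather than part of the derivation: it assumes units with ħ = 1, the case $a_y = a_z = 0$, and a Dirac initial state (so that the initial cross-averages vanish); the function name `mean_occupancies` is hypothetical. It propagates the operator amplitudes with the 2×2 matrix exp{-iht/ħ} and compares the result with Eq. (103).

```python
import numpy as np

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def mean_occupancies(n1_0, n2_0, omega, t):
    """<N_1>(t) and <N_2>(t) for a Dirac initial state, with a_y = a_z = 0."""
    a_x = HBAR * omega / 2.0
    h = np.array([[0.0, a_x],
                  [a_x, 0.0]])                  # single-particle matrix (8.89)
    energies, vectors = np.linalg.eigh(h)
    phases = np.exp(-1j * energies * t / HBAR)
    u = vectors @ np.diag(phases) @ vectors.conj().T   # a_k(t) = sum_l u[k, l] a_l(0)
    weights = np.abs(u) ** 2
    # For a Dirac state, <a_l^dagger(0) a_l'(0)> = N_l(0) for l = l', and zero otherwise.
    return weights @ np.array([n1_0, n2_0], dtype=float)

omega, t = 2.0, 0.3
print(mean_occupancies(1, 0, omega, t))   # single particle: cos^2 and sin^2 of omega*t/2
print(mean_occupancies(1, 1, omega, t))   # one particle per well: no oscillations
print(mean_occupancies(5, 0, omega, t))   # five bosons in well 1: amplitude grows 5-fold
```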


22 For the second well's occupancy, the result is complementary, $N_2(t) = N_1(0)\sin^2(\Omega t/2) + N_2(0)\cos^2(\Omega t/2)$, giving in
particular a good sanity check: $N_1(t) + N_2(t) = N_1(0) + N_2(0)$ = const.

The Dirac approach may be readily generalized to more complex systems. For example, an
arbitrary system of quantum wells with weak tunneling coupling between the adjacent wells may be
described by the Hamiltonian

$$\hat{H} = \sum_j \varepsilon_j\hat{a}_j^\dagger\hat{a}_j + \sum_{\langle j,j'\rangle}\delta_{jj'}\,\hat{a}_j^\dagger\hat{a}_{j'} + \text{h.c.}, \tag{8.104}$$

where the symbol $\langle j, j'\rangle$ means that the second sum is restricted to pairs of next-neighbor wells - see, e.g.,
Eq. (2.203) and its discussion. Note that this Hamiltonian is still a quadratic form of the creation-annihilation
operators, so the Heisenberg-picture equations of motion of these operators are linear, and
their exact solutions, though possibly cumbersome, may be studied in detail. Due to this fact, Hamiltonian
(104) is widely used for the study of some phenomena, for example the very interesting Anderson
localization effect, in which a random distribution of eigenenergies $\varepsilon_j$ prevents particles within a certain
energy range from spreading to unlimited distances. 23, 24
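Because Hamiltonian (104) is quadratic, its dynamics is fully determined by the single-particle matrix of well energies and tunneling amplitudes. The following sketch is only an illustration under loud assumptions: a 1D chain with open ends, a uniform real tunneling amplitude `hop` entering with a negative sign, and well energies drawn from a uniform distribution of width `disorder`. None of these choices comes from the text. The mean inverse participation ratio (the sum of the fourth powers of the normalized eigenvector components) is used here as a crude indicator of localization.

```python
import numpy as np

def chain_matrix(site_energies, hop):
    """Single-particle matrix of a 1D chain: well energies plus next-neighbor tunneling."""
    eps = np.asarray(site_energies, dtype=float)
    off_diagonal = -hop * np.ones(len(eps) - 1)   # sign and uniformity are assumptions
    return np.diag(eps) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)

def mean_ipr(h):
    """Mean inverse participation ratio of the eigenstates (larger = more localized)."""
    _, modes = np.linalg.eigh(h)                  # columns are normalized eigenvectors
    return float(np.mean(np.sum(np.abs(modes) ** 4, axis=0)))

rng = np.random.default_rng(0)
n_wells, hop, disorder = 200, 1.0, 5.0

clean = chain_matrix(np.zeros(n_wells), hop)
random_eps = rng.uniform(-disorder / 2, disorder / 2, size=n_wells)
disordered = chain_matrix(random_eps, hop)

ipr_clean = mean_ipr(clean)
ipr_disordered = mean_ipr(disordered)
print(ipr_clean, ipr_disordered)
```

For the chain without disorder the eigenstates spread over all wells, so the ratio scales as 1/n_wells, while random well energies make it much larger.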


8.4. Perturbative approaches

The situation becomes much more difficult if the problem requires an account of direct
interactions between the particles. Let us assume that the interaction may be reduced to that between
pairs - as is the case for their Coulomb interaction 25 and most other interactions, so that it may be
described with the following “pair-interaction” Hamiltonian

$$\hat{U}_{\rm int} = \frac{1}{2}\sum_{\substack{k,k'=1 \\ k\ne k'}}^{N}\hat{u}_{\rm int}(\mathbf{r}_k, \mathbf{r}_{k'}), \tag{8.105}$$

with the front factor of 1/2 compensating the double-counting of each particle pair. The translation of this
operator to the second-quantization form may be done absolutely similarly to the derivation of Eq. (77),
and gives a similar (though naturally more bulky) result 26

$$\hat{U}_{\rm int} = \frac{1}{2}\sum_{j,j',l,l'} u_{jj'll'}\,\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_{l'}\hat{a}_l, \tag{8.106}$$

where the two-particle matrix elements are defined similarly to Eq. (74):

$$u_{jj'll'} \equiv \langle\beta_j\beta_{j'}|\hat{u}_{\rm int}|\beta_l\beta_{l'}\rangle. \tag{8.107}$$

Even in this case, the resulting Heisenberg equations of motion are nonlinear, so that solving 
them and calculating observables from the results is usually impossible, at least analytically. The only 


23 For a review of the 1D version of this problem, see, e.g., J. Pendry, Adv. Phys. 43, 461 (1994).

24 To complete this section, I have to note, at least in passing, a different form of the second-quantization 
formalism, based on the so-called field operators. It will be more natural for me to discuss it in the next chapter. 

25 Another important example is the so-called Hubbard model in which there may be only two particles on each of 
localized sites, with the negligible interaction of particles on different sites - which are only connected by the 
next-neighbor tunneling - see Eq. (104). 

26 The only new feature is a specific order of the indices of the creation operators. Note the mnemonic rule of
writing this expression, similar to that for Eq. (79): each term corresponds to moving a pair of particles from
states l and l' to states j' and j, factored with the corresponding two-particle matrix element (107).

case when some general results may be obtained is the weak interaction limit. In this case the
unperturbed Hamiltonian contains only single-particle terms such as in Eqs. (71), so we can always (at
least conceptually :-) find such a basis of orthonormal single-particle states in which that Hamiltonian
is diagonal in the Dirac representation:

$$\hat{H}^{(0)} = \sum_j \varepsilon_j^{(0)}\hat{a}_j^\dagger\hat{a}_j. \tag{8.108}$$


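Now we can use Eq. (6.13) in this basis to calculate the interaction energy as a first-order perturbation:

$$E_{\rm int}^{(1)} = \langle N_1, N_2, \ldots|\hat{U}_{\rm int}|N_1, N_2, \ldots\rangle = \frac{1}{2}\langle N_1, N_2, \ldots|\sum_{j,j',l,l'} u_{jj'll'}\,\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_{l'}\hat{a}_l|N_1, N_2, \ldots\rangle$$
$$\equiv \frac{1}{2}\sum_{j,j',l,l'} u_{jj'll'}\langle N_1, N_2, \ldots|\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_{l'}\hat{a}_l|N_1, N_2, \ldots\rangle. \tag{8.109}$$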


Since, according to Eqs. (81)-(82), the Dirac states with different occupancies are orthogonal, the last 
average yields nonvanishing results only for three particular subsets of the indices: 


(i) j ≠ j', l = j, and l' = j'. In this case the 4-operator product in Eq. (109) equals $\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_{j'}\hat{a}_j$, and
applying the commutation rules twice, we can bring it to the so-called normal ordering, with each
creation operator standing immediately to the left of the corresponding annihilation operator, thus forming the
particle number operator (64):

$$\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_{j'}\hat{a}_j = \pm\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_j\hat{a}_{j'} = \pm\hat{a}_j^\dagger\left(\pm\hat{a}_j\hat{a}_{j'}^\dagger\right)\hat{a}_{j'} = \hat{a}_j^\dagger\hat{a}_j\hat{a}_{j'}^\dagger\hat{a}_{j'} = \hat{N}_j\hat{N}_{j'}, \tag{8.110}$$

with the similar sign of the final result for bosons and fermions.

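(ii) j ≠ j', l = j', and l' = j. In this case the 4-operator product equals $\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_j\hat{a}_{j'}$, and bringing it to
the form $\hat{N}_j\hat{N}_{j'}$ requires only one commutation:

$$\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_j\hat{a}_{j'} = \hat{a}_j^\dagger\left(\pm\hat{a}_j\hat{a}_{j'}^\dagger\right)\hat{a}_{j'} = \pm\hat{a}_j^\dagger\hat{a}_j\hat{a}_{j'}^\dagger\hat{a}_{j'} = \pm\hat{N}_j\hat{N}_{j'}, \tag{8.111}$$

with the upper sign for bosons and the lower sign for fermions.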

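(iii) All indices equal to each other, giving $\hat{a}_j^\dagger\hat{a}_{j'}^\dagger\hat{a}_{l'}\hat{a}_l = \hat{a}_j^\dagger\hat{a}_j^\dagger\hat{a}_j\hat{a}_j$. For fermions, such an operator
(that “tries” to create or kill two particles in a row, in the same state) immediately gives the null vector.
In the case of bosons, we may use Eq. (66) to commute the internal pair of operators, getting

$$\hat{a}_j^\dagger\hat{a}_j^\dagger\hat{a}_j\hat{a}_j = \hat{a}_j^\dagger\left(\hat{a}_j\hat{a}_j^\dagger - \hat{I}\right)\hat{a}_j = \hat{N}_j\left(\hat{N}_j - \hat{I}\right). \tag{8.112}$$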

Note, however, that this formula formally covers the fermion case as well (always giving zero). As a
result, Eq. (109) may be rewritten in the following universal form:

$$E_{\rm int}^{(1)} = \frac{1}{2}\sum_{\substack{j,j' \\ j\ne j'}} N_jN_{j'}\left(u_{jj'jj'} \pm u_{jj'j'j}\right) + \frac{1}{2}\sum_j N_j(N_j - 1)\,u_{jjjj}. \tag{8.113}$$

The consequences of this result are very different for bosons and fermions. In the former case,
the last term usually dominates, because the matrix elements (107) are typically the largest when all 
basis functions coincide. Note that this term allows a very simple interpretation: the number of the 
diagonal matrix elements it sums up for each state (j) is just the number of interacting particle pairs 
residing in that state. 
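The pair-counting interpretation of the last term may be checked directly. This minimal sketch (an illustration only, for a single bosonic state truncated at an arbitrarily chosen occupancy `n_max`) verifies the operator identity (112) in matrix form and lists the resulting numbers of pairs N(N - 1)/2.

```python
import numpy as np

n_max = 6                                              # truncation of the occupancy (assumption)
a = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)     # a|n> = sqrt(n)|n-1>
a_dag = a.T
identity = np.eye(n_max + 1)
number = a_dag @ a                                     # particle number operator

four_product = a_dag @ a_dag @ a @ a                   # left-hand side of Eq. (8.112)
identity_112_holds = np.allclose(four_product, number @ (number - identity))

# Diagonal elements divided by 2: numbers of pairs 0, 0, 1, 3, 6, 10, 15 for N = 0..6.
pair_counts = np.round(np.diag(four_product) / 2).astype(int)
print(identity_112_holds, pair_counts)
```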

In contrast, for fermions the last term is zero, and the interaction energy is the difference of the two
terms inside the first parentheses. In order to spell them out, let us consider the case when there is no
direct spin-orbit interaction. Then the vectors $|\beta\rangle_j$ of the single-particle state basis may be represented as
products $|o\rangle_j\otimes|m\rangle_j$ of their orbital and spin orientation parts. (Here, for brevity, I am using m instead of
$m_s$.) For spin-½ particles (say, electrons), these orientations m may equal only +1/2 and -1/2; in this case
the spin part of the bra-ket $u_{jj'jj'}$ equals

$$\left(\langle m|\otimes\langle m'|\right)\left(|m\rangle\otimes|m'\rangle\right), \tag{8.114}$$

where, as in the general Eq. (107), the position of a particular vector in a product codes the particle
number. Now since spins of different electrons are defined in different Hilbert spaces, we may move
their vectors around to get

$$\left(\langle m|\otimes\langle m'|\right)\left(|m\rangle\otimes|m'\rangle\right) = \langle m|m\rangle_1\times\langle m'|m'\rangle_2 = 1, \tag{8.115}$$

for any pair of j and j'. On the other hand, $u_{jj'j'j}$ is proportional to

$$\left(\langle m|\otimes\langle m'|\right)\left(|m'\rangle\otimes|m\rangle\right) = \langle m|m'\rangle_1\times\langle m'|m\rangle_2 = \delta_{mm'}. \tag{8.116}$$

In this case, it is convenient to rewrite Eq. (113) in the coordinate representation, using single-particle
wavefunctions called spin-orbitals

$$\psi_j(\mathbf{r}) \equiv \langle\mathbf{r}|\beta_j\rangle = \langle\mathbf{r}|o\rangle_j\otimes|m\rangle_j. \tag{8.117}$$

They differ from the “usual” orbital wavefunctions of the type (5.19) only by that their index j should be
understood as the set of the orbital state index and the spin orientation index m. 27 Also, due to the
Pauli-principle restriction of numbers $N_j$ to either 0 or 1, Eq. (113) may be also rewritten without the
occupancy numbers, with the understanding that the summation is extended only over the pairs of
occupied states. As a result, Eq. (113) becomes

$$E_{\rm int}^{(1)} = \frac{1}{2}\sum_{\substack{j,j' \\ j\ne j'}}\int d^3r\int d^3r'\left[\psi_j^*(\mathbf{r})\psi_{j'}^*(\mathbf{r}')\,u_{\rm int}(\mathbf{r},\mathbf{r}')\,\psi_j(\mathbf{r})\psi_{j'}(\mathbf{r}') - \psi_j^*(\mathbf{r})\psi_{j'}^*(\mathbf{r}')\,u_{\rm int}(\mathbf{r},\mathbf{r}')\,\psi_{j'}(\mathbf{r})\psi_j(\mathbf{r}')\right]. \tag{8.118}$$

If, for a system of 2 electrons, we limit the summation to 2 states (j, j' = 1, 2), we get the result
absolutely similar to Eqs. (44)-(45), with the minus sign in Eq. (44). Hence, Eq. (118) may be
considered as the generalization of the direct and exchange interaction balance picture to an arbitrary
number of orbitals and an arbitrary total number N of electrons. Note, however, that this equation cannot


correctly describe the energy of the excited singlet state, corresponding to the plus sign in Eq. (44). 28
The reason is that the description of entangled spin states, given by Eq. (19) and the last term of Eq.
(21), requires linear superpositions of different Dirac states, and hence is not covered by our assumption
(108).

27 Constructs (117) are also close to spinors (14), besides that the spin s of a single particle is fixed, so that the
spin-orbital should be indexed by the spin's orientation $m \equiv m_s$ rather than the full spin s. Also, the orbital index
should be clearly distinguished from j (which, again, is the set of that orbital index and m). This is why I believe
that the frequently met notation of spinors as $\psi_{j,s}(\mathbf{r})$ may lead to confusion.


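Now comes a very important fact: the approximate result (118), added to the sum of unperturbed
energies $\varepsilon_j^{(0)}$, equals the sum of exact eigenenergies of the so-called Hartree-Fock equation: 29

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + u(\mathbf{r})\right]\psi_j(\mathbf{r}) + \sum_{j'\ne j}\int\left[\psi_{j'}^*(\mathbf{r}')\,u_{\rm int}(\mathbf{r},\mathbf{r}')\,\psi_j(\mathbf{r})\psi_{j'}(\mathbf{r}') - \psi_{j'}^*(\mathbf{r}')\,u_{\rm int}(\mathbf{r},\mathbf{r}')\,\psi_{j'}(\mathbf{r})\psi_j(\mathbf{r}')\right]d^3r' = \varepsilon_j\psi_j(\mathbf{r}), \tag{8.119}$$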

where u(r) is the external-field potential acting on each particle separately - see Eq. (71). An advantage 
of this equation in comparison with Eq. (118) is that it allows the (approximate) calculation of not only 
the energy of the system, but also the corresponding spin-orbitals, taking into account the electron- 
electron interaction. 


In the limit when the single-particle wavefunction overlaps are small and hence the exchange
interaction is negligible, the last term in the square brackets may be ignored, the factor $\psi_j(\mathbf{r})$ may be taken out of
the integral, and Eq. (119) becomes similar to the single-particle Schrödinger equation with the following effective
potential

$$u_{\rm ef}(\mathbf{r}) = u(\mathbf{r}) + u_{\rm dir}(\mathbf{r}), \qquad u_{\rm dir}(\mathbf{r}) = \sum_{j'\ne j}\int\psi_{j'}^*(\mathbf{r}')\,u_{\rm int}(\mathbf{r},\mathbf{r}')\,\psi_{j'}(\mathbf{r}')\,d^3r'. \tag{8.120}$$

This is the so-called Hartree approximation - that gives reasonable results for some systems. 30 However,
in dense electron systems (such as typical atoms, molecules, and condensed matter) the exchange
interaction, described by the second term in the square brackets of Eq. (119), is typically of the order of 
30% of the direct interaction, and frequently this effect cannot be ignored. In this case, Eq. (119) is an 
integro-differential rather than just differential equation. 


There are efficient numerical methods of solving such equations, typically iterative,
though they require large memory and CPU-cycle resources even for systems of ~10²
electrons. 31 This is why the Hartree-Fock approximation is the de-facto baseline of all so-called ab-initio


28 Note that due to the condition j' ≠ j, and Eq. (116), the exchange interaction is limited to electron state pairs with
the same spin direction - again in a good correspondence with the triplet states (like ↑↑ or ↓↓) of a two-electron
system, in which the contribution of $E_{\rm ex}$ (8.45b) to the total energy is also negative.

29 This equation was suggested in 1929 by D. Hartree for the direct interaction, and extended to the exchange
interaction by V. Fock in 1930. In order to verify its equivalence to Eq. (118), it is sufficient to multiply all terms
of Eq. (119) by $\psi_j^*(\mathbf{r})$, integrate them over all r space (so that the right-hand part would give $\varepsilon_j$), and then sum
these single-particle energies over all occupied states j.

30 An extreme expression of the Hartree approximation is the very simple Thomas-Fermi model of heavy atoms 
(with the atomic number Z » 1), in which the gradient of the electrostatic potential is also neglected, i.e. the 
atomic electrons are treated essentially as an ideal Fermi gas - see SM Chapter 3. 

31 Surprisingly, this is sufficient to describe, with reasonable accuracy, many properties of condensed matter, by
breaking it into similar elementary spatial cells (say, Bravais cells of crystals), with cyclic boundary conditions and
a limited number of electrons in each cell.

(“first-principle”) calculations in condensed matter physics and quantum chemistry. 32 In departures from
this baseline, there are two opposite trends. For larger accuracy (and typically smaller systems), several 
“post-Hartree-Fock methods”, notably including the configuration interaction method, 33 that are more 
complex but may provide higher accuracy, have been developed. 


There is also a strong opposite trend of extending ab-initio methods to larger systems, while
sacrificing the result accuracy and reliability. This trend is currently dominated by the Density
Functional Theory, 34 universally known by its acronym DFT. In this approach, the equation solved for
each eigenfunction $\psi_j(\mathbf{r})$ is a differential, Schrödinger-like Kohn-Sham equation

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + u(\mathbf{r}) + u_{\rm dir}^{\rm KS}(\mathbf{r}) - u_{\rm xc}(\mathbf{r})\right]\psi_j(\mathbf{r}) = \varepsilon_j\psi_j(\mathbf{r}), \tag{8.121}$$


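where

$$u_{\rm dir}^{\rm KS}(\mathbf{r}) = -e\phi(\mathbf{r}), \qquad \phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int d^3r'\,\frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}, \qquad \rho(\mathbf{r}) = -en(\mathbf{r}), \tag{8.122}$$

and $n(\mathbf{r})$ is the total electron density at a particular point, calculated as

$$n(\mathbf{r}) \equiv \sum_j\psi_j^*(\mathbf{r})\psi_j(\mathbf{r}). \tag{8.123}$$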

The effective exchange-correlation potential $u_{\rm xc}(\mathbf{r})$ (that differs from the genuine exchange
potential, participating in Eq. (121), by the inclusion of the term with j = j') is calculated in various
approximations, most valid only asymptotically in the limit when the electron number is high. The
simplest of them is the Local Density Approximation (LDA) in which the effective exchange potential at
each point is a function only of the electron density (123) at the same point, taken from the theory of a
uniform gas of free electrons. 35 Another simplification, that dramatically cuts the computing resources
necessary for systems of relatively heavy atoms, is the exclusion of the filled internal electron shells (see
Sec. 3.7) from the explicit calculations, because the shell states are virtually unperturbed by the valence
electron effects involved in typical atomic phenomena and chemical reactions. In this approach, the
Coulomb field of the shells, described by fixed, pre-calculated and tabulated pseudo-potentials, is added to
that of the nuclei. Unfortunately, because of lack of time, for details I have to refer the reader to
specialized literature. 36


32 See, e.g., A. Szabo and N. Ostlund, Modern Quantum Chemistry, McGraw-Hill, 1989. 

33 That method, in particular, allows the calculation of proper linear superpositions of the Dirac states (such as the 
excited singlet state for N= 2, discussed above) which are missing in the generic Hartree-Fock approach. 

34 It was developed by W. Kohn and coauthors in the mid-1960s, and eventually (in 1998) awarded with a Nobel 
prize in chemistry. 

35 For a uniform, degenerate Fermi-gas of electrons (with the Fermi energy $\varepsilon_F \gg k_BT$), the exchange potential
may be calculated analytically, giving $u_{\rm ex} = (3/4\pi)e^2k_F/4\pi\varepsilon_0$, where $k_F$ is the Fermi-surface wave number that
defines both the Fermi energy $\varepsilon_F = (\hbar k_F)^2/2m$ and the electron density (per unit volume) $n = 2(4\pi/3)k_F^3/(2\pi)^3 \equiv k_F^3/3\pi^2$.

36 See, e.g., G. te Velde et al., J. Comp. Chem. 22, 931 (2001), and/or M. D. Segall et al., J. Phys. - Cond. Matt.
14, 2717 (2002), and references therein.

Let me, however, emphasize that despite the wide use of the DFT, 37 and its undisputable
successes in describing some experimental data, it has problems. For me personally, its largest
conceptual deficiency is the incorporation of the absolutely unphysical Coulomb interaction of an
electron with itself (by dropping the condition j' ≠ j). As a result, existing DFT packages require substantial
artificial tinkering to use them for the description of such processes as single-electron transfer. 38 A little bit
light-heartedly (but still correctly), one may say that an advanced DFT software package, run on a huge
supercomputer, cannot be used to calculate the correct energy spectrum of a hydrogen atom - a century
after this had been done by Niels Bohr on a slip of paper!


8.5. Quantum computation and cryptography 

Now I have to review the emerging fields of quantum computation and encryption. 39 These fields
are currently the subject of a very intensive research effort, which has brought (besides much hype :-) a 
few results of genuine importance for quantum mechanics. My coverage, by necessity short, will 
emphasize these fundamental results, referring the reader interested in details to special literature. 40 
Because of the active stage of the fields, I will also provide quite a few references to recent publications, 
making the style of this section closer to a brief research review than to a part of a textbook. 

Presently, the work on quantum computation and encryption is focused on systems of spatially- 
separated (and hence distinguishable ) two-level systems - in this context, commonly called qubits. 41 
Due to this distinguishability, the issues that were the focus of the past few sections (including the 
benefits of the second quantization) are irrelevant here. On the other hand, systems of distinguishable
qubits have some interesting properties that have not yet been discussed in this course.

First of all, a system of N » 1 qubits may contain much more information than the N classical 
bits - which is the maximum information capacity of N classical bistable systems. Indeed, according to 
the discussions in Chapter 4, an arbitrary pure state of a single qubit may be represented by its ket vector 
(4.37) - see also Eq. (5.1): 

$$|\alpha\rangle_{N=1} = \alpha_1|u_1\rangle + \alpha_2|u_2\rangle, \tag{8.124}$$

where {u} is any orthonormal two-state basis. In the quantum information theory, it is natural and
common to employ, as $u_j$, the eigenstates $a_j$ of the observable A that is eventually measured in the
particular physical implementation of the qubit - say, a certain spatial component of a spin-½ particle, etc.
It is also common to write the kets of these base states as |0⟩ and |1⟩, so that Eq. (124) takes the form 42


37 This popularity is enhanced by the availability of several advanced DFT software packages, some of them (such 
as SIESTA, http://icmab. cat/leem/siesta/~) in public domain. 

38 See, e.g., N. Simonian et al., J. Appl. Phys. 113 , 044504 (2013). 

39 Since these fields are much related, they are often referred to together, under the (somewhat misleading) title of 
“quantum information”. 

40 Despite many recent book titles in the field, one of its first surveys, by M. Nielsen and I. Chuang, Quantum 
Computation and Quantum Information , Cambridge U. Press, 2000, is perhaps still the best one. 

41 In some texts, the term qubit (or “Qbit”, or “Q-bit”) is used instead for the information contents of a two-level 
system - very much like the classical bit of information (in this context, frequently called “Cbit” or “C-bit”) 
describes the information contents of a classical bistable system - see, e.g., SM Sec. 2.2. 

42 The slightly odd aspect of this notation is that in the Bloch sphere representation (Fig. 5.1), the North Pole state
(that is traditionally denoted as ↑ in other fields of quantum mechanics) is taken for 0, while the South Pole state
↓ for 1, so that Eqs. (5.4) take the form $a_0 = \cos(\theta/2)$, $a_1 = \sin(\theta/2)\exp\{i\varphi\}$.

$$|\alpha\rangle_{N=1} = a_0|0\rangle + a_1|1\rangle \equiv \sum_{j=0,1}a_j|j\rangle, \tag{8.125}$$

where in the rest of this chapter, the letter j will be used to denote an integer equal to either 0 or 1. Hence
any pure state α of a qubit is completely defined by two complex c-numbers $a_j$, i.e. by 4 real numbers.
Moreover, due to the normalization condition $|a_0|^2 + |a_1|^2 = 1$, we need just 3 independent real numbers -
say, the Bloch sphere coordinates θ and φ (see Fig. 5.1), plus the common phase γ, which becomes
important when we consider coherent states of several qubits - see Eq. (5.3).


Now, if we have a system of 2 qubits, its arbitrary pure state (4.37) may be represented as a sum
of 2² = 4 terms, 43

$$|\alpha\rangle_{N=2} = a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle \equiv \sum_{j_1,j_2=0,1}a_{j_1j_2}|j_1j_2\rangle, \tag{8.126}$$

with 4 complex coefficients, i.e. 4×2 = 8 real numbers, subject to just one normalization condition 44

$$\sum_{j_1,j_2=0,1}\left|a_{j_1j_2}\right|^2 = 1. \tag{8.127}$$


An evident generalization of Eqs. (125)-(126) to an arbitrary pure state of an N-qubit system is
given by a sum of $2^N$ terms:

$$|\alpha\rangle_N = \sum_{j_1,j_2,\ldots,j_N=0,1}a_{j_1j_2\ldots j_N}|j_1j_2\ldots j_N\rangle, \tag{8.128}$$

including all possible combinations of 0s and 1s inside the ket, so that the state is fully described by $2^N$
complex numbers, i.e. $2\cdot 2^N = 2^{N+1}$ real numbers, with only one constraint, similar to Eq. (127), imposed
by the normalization condition. Let me emphasize that this exponential growth of the information
contents would not be possible without the qubit state entanglement. Indeed, in the particular case when
qubit states are unentangled (separable),

$$|\alpha\rangle_N = |\alpha_1\rangle|\alpha_2\rangle\ldots|\alpha_N\rangle, \tag{8.129}$$

where each $|\alpha_n\rangle$ is described by an equality similar to Eq. (125) with its individual expansion
coefficients, the system state description requires only 3N real numbers - e.g., N sets {θ, φ, γ}.
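The difference between the two parameter counts may be illustrated with a few lines of code. This is only a sketch under stated assumptions: the function names are hypothetical, the single-qubit parametrization follows the Bloch-angle form quoted in the text, and the two-qubit separability test uses the elementary fact that a product state has $a_{00}a_{11} - a_{01}a_{10} = 0$.

```python
import numpy as np

def qubit(theta, phi, gamma=0.0):
    """Single-qubit ket a_0|0> + a_1|1> from Bloch angles and a common phase."""
    return np.exp(1j * gamma) * np.array(
        [np.cos(theta / 2), np.sin(theta / 2) * np.exp(1j * phi)]
    )

def product_state(kets):
    """Unentangled N-qubit state as the direct product of single-qubit kets."""
    state = np.array([1.0 + 0.0j])
    for ket in kets:
        state = np.kron(state, ket)
    return state

def real_parameters(n_qubits):
    """Real-number counts before normalization, following the counting in the text."""
    return {"general": 2 * 2 ** n_qubits, "separable": 3 * n_qubits}

def is_separable_two_qubit(state, tol=1e-12):
    """A two-qubit pure state is a product state if a00*a11 - a01*a10 = 0."""
    a00, a01, a10, a11 = state
    return abs(a00 * a11 - a01 * a10) < tol

three = product_state([qubit(0.4, 1.1), qubit(2.0, -0.3, 0.5), qubit(1.2, 0.0)])
pair = product_state([qubit(0.4, 1.1), qubit(2.0, -0.3)])
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)      # (|00> + |11>)/sqrt(2), an entangled state

print(len(three), np.linalg.norm(three))
print(is_separable_two_qubit(pair), is_separable_two_qubit(bell))
print(real_parameters(10))
```

Already for 10 qubits the general state needs 2048 real numbers against 30 for a separable one.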

However, it is wrong (as it is sometimes done in popular reviews) to project this exponential 
growth of information contents directly on the capabilities of quantum computation, because this 
process has to include the output information readout, i.e. qubit state measurements. Due to the 
fundamental intrinsic uncertainty of quantum systems, the measurement of a single qubit even in a pure 
state (125) generally gives uncertain results, with probabilities $W_0 = |a_0|^2$ and $W_1 = |a_1|^2$. In order to
comply with the general notion of digital computation, a quantum computer has to provide certain (or 


43 Here and in most instances below I use the same shorthand notation as was used in the beginning of this chapter 
- cf. Eq. (8.1). In this short form, qubit’s number is coded by the order of its state index inside the single ket- 
vector, while in the long form, such as in Eq. (129), it is coded by the order of the ket-vector. 

44 It follows from the requirement that the sum of two probabilities $W_j = \langle\alpha|\hat{\Lambda}_j|\alpha\rangle$ (where $\hat{\Lambda}_j \equiv |j\rangle\langle j|$ is the
corresponding projection operator, see Sec. 4.5) to find one of qubits in one of its two possible states $j$, equals 1. It
is remarkable that the application of this condition to any of the qubits results in the same Eq. (127).

virtually certain) results, and hence probabilities $W_j$ have to be very close to either 0 or 1, so that before
the measurement, each qubit has to be in a basis state - either 0 or 1. This means that the computational
system of $N$ qubits, just before the final readout, has to be in one of the basis states

$$|\alpha\rangle_N = |j_1\rangle|j_2\rangle \ldots |j_N\rangle \equiv |j_1 j_2 \ldots j_N\rangle, \tag{8.130}$$

which is a very small subset even of the set (129) of all unentangled states, and whose maximum
information content is just $N$ classical bits.

Now the reader may start thinking that this constraint strips quantum computations of any 
advantages over their classical counterparts, but this view is also superficial. In order to show that, let us 
consider the scheme of the most frequently explored type of quantum computation, shown in Fig. 2. 45 


[Figure 8.2 diagram labels: “classical bits of the input number” (input side), “classical bits of the output number” (output side).]

Fig. 8.2. The baseline scheme of quantum computation. 


Here each horizontal line (sometimes called a “wire” 46 ) corresponds to a single qubit, tracing its 
time evolution in the same direction as at the usual time function plots: from left to right. This means 
that the left column $|\alpha\rangle_{\rm in}$ of ket-vectors describes the initial state of qubits, 47 while the right column $|\alpha\rangle_{\rm out}$
describes their final (pre-detector) state. The box labeled U represents the qubit evolution in time due to


45 Numerous modifications of this baseline scheme have been suggested, for example with the number of output 
qubits different from that of input qubits, etc. Some other options are discussed in the end of this section. 

46 The notion of “wires” stems from the similarity between these diagrams and the drawings used to describe 
classical computation circuits (see, e.g., Fig. 3a below); in the latter case the lines may be indeed understood as 
physical wires connecting physical devices: logic gates and/or memory cells. In this context note that classical 
computer components also have nonvanishing time delays, so that even in this case the left-to-right device
ordering is useful to indicate the timing of (and frequently the causal relation between) the signals.

47 As we know from Chapter 7, the preparation of pure state (125) is (conceptually :-) straightforward. Placing a
qubit into a weak contact with an environment of temperature $T \ll \Delta/k_{\rm B}$, where $\Delta$ is the difference between
energies of eigenstates $|0\rangle$ and $|1\rangle$, we may achieve its relaxation into the lowest-energy state. (Otherwise, the
relaxation may be to one of states with equal, or nearly-equal energies, combined with its measurement - see Fig.
7.8 and its discussion.) Then, if the qubit must be set into the opposite state, it may be driven there by the
application of a pulse of a proper external classical “force”. For example, if actual spin-1/2 particles are used as
qubits, a constant magnetic field may be applied in the [x, y] plane for a half-period of the torque-induced spin
precession - see Fig. 5.1c. However, for most qubit implementations, the basis state reversal using a half-period
of rf-induced Rabi oscillations (Sec. 6.5) is more convenient.

their specially arranged interactions between each other and/or external drive “forces”. Besides these
forces, during this evolution the system is supposed to be isolated from the dephasing and energy-dissipating
environment, so that it may be described by a unitary operator defined in the $2^N$-dimensional
Hilbert space of $N$ qubits:

$$|\alpha\rangle_{\rm out} = \hat{U}|\alpha\rangle_{\rm in}. \tag{8.131}$$

With the condition that the input and output states have the simple form (130), this equality reads

$$|(j_1)_{\rm out}(j_2)_{\rm out} \ldots (j_N)_{\rm out}\rangle = \hat{U}|(j_1)_{\rm in}(j_2)_{\rm in} \ldots (j_N)_{\rm in}\rangle. \tag{8.132}$$

The art of quantum computer design is selecting such unitary operators U that would: 

- satisfy Eq. (132), 

- be physically implementable, 

- enable substantial performance advantages of the quantum computation over its classical
counterpart of similar functionality, at least for some digital functions (algorithms).

I will have time to demonstrate the possibility of such advantages on just one, perhaps the
simplest example - the so-called Deutsch problem. 48 Let us consider the family of single-bit classical
Boolean functions $j_{\rm out} = f(j_{\rm in})$. Since both $j$ are Boolean variables, i.e. may take only values 0 and 1, there
are evidently only 4 such functions:

$$\begin{array}{c|cc|c|c|c}
f & f(0) & f(1) & \text{class} & F & f(1) - f(0) \\ \hline
f_1 & 0 & 0 & \text{constant} & 0 & 0 \\
f_2 & 0 & 1 & \text{balanced} & 1 & 1 \\
f_3 & 1 & 0 & \text{balanced} & 1 & -1 \\
f_4 & 1 & 1 & \text{constant} & 0 & 0
\end{array} \tag{8.133}$$


Of them, functions $f_1$ and $f_4$, whose values are independent of their arguments, are called constants,
while functions $f_2$ (called “YES” or “IDENTITY”) and $f_3$ (“NOT” or “INVERSION”) are called
balanced. The Deutsch problem is to determine the class of a single-bit function, implemented as a
“black box”, as being either constant or balanced, using just one experiment.

Classically, this is clearly impossible, and the simplest way to perform the function classification
involves two similar black boxes $f$ - see Fig. 3a.



Fig. 8.3. The simplest (a) classical and (b) quantum ways to classify a single-bit Boolean function $f$.


48 Named after D. Deutsch, whose 1985 paper (motivated by an inspirational but not very specific publication by
R. Feynman in 1982) launched the whole field of quantum computation.

This solution uses the so-called exclusive-OR (for short, XOR) gate whose output is described by
the following function $F$ of its two Boolean arguments $j_1$ and $j_2$:

$$F(j_1, j_2) = j_1 \oplus j_2 \equiv \begin{cases} 0, & \text{if } j_1 = j_2, \\ 1, & \text{if } j_1 \neq j_2. \end{cases} \tag{8.134}$$


In the circuit shown in Fig. 3a, the gate produces output

$$F = f(0) \oplus f(1), \tag{8.135}$$

equal to 1 if $f(0) \neq f(1)$, i.e. if function $f$ is balanced, and 0 in the opposite case - see the 4th column in
Eq. (133). 49
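The classical procedure just described may be sketched in a few lines. This is only an illustration under stated assumptions: the dictionary keys and the function name `classify_classically` are hypothetical labels, and the four lambdas are simply the four possible single-bit Boolean functions.

```python
# The four single-bit Boolean functions (hypothetical labels).
FUNCTIONS = {
    "constant 0": lambda j: 0,
    "YES": lambda j: j,
    "NOT": lambda j: 1 - j,
    "constant 1": lambda j: 1,
}

def classify_classically(f):
    """Classify f using TWO evaluations of the black box and one XOR gate."""
    f0 = f(0)          # first use of the black box
    f1 = f(1)          # second use of the black box
    F = f0 ^ f1        # XOR gate, Eqs. (8.134)-(8.135)
    return "balanced" if F == 1 else "constant"

for name, f in FUNCTIONS.items():
    print(name, "->", classify_classically(f))
```

Note that the black box is called twice here; no classical strategy can avoid the second call.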

On the other hand, let us assume that all four functions $f$ may be implemented quantum-mechanically,
for example as a unitary transform acting on two qubits (Fig. 4a), and acting as follows
on each of the basis components $|j_1 j_2\rangle = |j_1\rangle|j_2\rangle$ of the general input state (126):

$$\hat{f}|j_1\rangle|j_2\rangle = |j_1\rangle|j_2 \oplus f(j_1)\rangle, \tag{8.136}$$

where $f$ is any of the classical Boolean functions defined by Eq. (133).


[Figure 8.4 state labels: (a) inputs $|j_1\rangle$, $|j_2\rangle$; outputs $|j_1\rangle$, $|j_2 \oplus f(j_1)\rangle$; (b) inputs $|j_1\rangle$, $|j_2\rangle$; outputs $|j_1\rangle$, $|j_2 \oplus j_1\rangle$.]

Fig. 8.4. Two-qubit quantum gates: (a) two-qubit function $f$ and (b) its particular case C (CNOT), and
their actions on the basis states.


In the particular case when $f$ is the YES function: $f(j) = f_2(j) = j$, gate $f$ is reduced to the so-called
CNOT gate - a key ingredient of other quantum computation schemes, performing transform

$$\hat{C}|j_1 j_2\rangle = |j_1\rangle|j_2 \oplus j_1\rangle. \tag{8.137a}$$

Let us spell out this rule for all four possible input qubit combinations:

$$\hat{C}|00\rangle = |00\rangle, \quad \hat{C}|01\rangle = |01\rangle, \quad \hat{C}|10\rangle = |11\rangle, \quad \hat{C}|11\rangle = |10\rangle. \tag{8.137b}$$

In plain English, this means that acting on basis states $|j_1 j_2\rangle$, the CNOT gate leaves the state of the first,
source qubit (shown by the upper lines in Fig. 4) intact, but flips the state of the second, target qubit if
the first one is in the basis state $|1\rangle$. In even simpler words, the state $j_1$ of the source qubit controls the
NOT function acting on the target qubit - hence the gate’s name CNOT (the semi-acronym of
“Controlled NOT”).
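As an illustration of rules (8.136)-(8.137), the following sketch builds the 4×4 matrix of the two-qubit gate for an arbitrary Boolean function. The function name `gate_matrix` and the ordering of the basis states, $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (index $2j_1 + j_2$), are assumptions made for this example only.

```python
import numpy as np

def gate_matrix(f):
    """Matrix of the gate |j1>|j2> -> |j1>|j2 XOR f(j1)>, Eq. (8.136).

    Assumed basis order: |00>, |01>, |10>, |11>, i.e. index = 2*j1 + j2.
    """
    U = np.zeros((4, 4))
    for j1 in (0, 1):
        for j2 in (0, 1):
            column = 2 * j1 + j2               # input basis state
            row = 2 * j1 + (j2 ^ f(j1))        # output basis state
            U[row, column] = 1
    return U

# CNOT is the particular case f(j) = j, Eq. (8.137)
CNOT = gate_matrix(lambda j: j)
print(CNOT)

# Each of the four gates is a permutation of basis states, hence unitary
for f in (lambda j: 0, lambda j: j, lambda j: 1 - j, lambda j: 1):
    U = gate_matrix(f)
    print(np.array_equal(U.T @ U, np.eye(4)))
```

The printed CNOT matrix exchanges only the last two basis states, $|10\rangle$ and $|11\rangle$, in agreement with Eq. (8.137b).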


49 Alternatively, we may perform two sequential experiments on the same black box $f$, first recording and then
recalling their results.




For the quantum function (136), the Deutsch problem may be solved within the general scheme
shown in Fig. 2, with the particular structure of the unitary-transform box U spelled out in Fig. 3b,
which involves just one implementation of the function. Here the single-qubit quantum gate $\hat{\mathcal{H}}$
symbolizes the so-called Hadamard (or “Walsh-Hadamard”) transform 50 whose linear operator is
defined by the following actions on qubit’s basis states:

$$\hat{\mathcal{H}}|0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right), \qquad \hat{\mathcal{H}}|1\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right) \tag{8.138}$$


- see also the 4 left state labels in Fig. 3b. 51 On the Bloch sphere (Fig. 5.1), and in the usual spin-1/2
notation, Eqs. (138) correspond to the transfer of the representing point from the North Pole’s state $\uparrow$,
i.e. one of the eigenstates of matrix $\sigma_z$, to one of equatorial states, $\rightarrow$, i.e. one of the eigenstates of
matrix $\sigma_x$, and from the South Pole state $\downarrow$ to another equatorial state, $\leftarrow$ - see Eq. (4.122). However,
a $\pi/2$-rotation in the [x, z] plane would be a poor interpretation of this function. Indeed, since its operator
has to be linear (to be physically realistic), it needs to perform action (138) on the basis states even
when they are parts of an arbitrary linear superposition - as they are, e.g., for the two right Hadamard
gates in Fig. 3b. For example, as immediately follows from Eq. (138) and operator’s linearity,

$$\hat{\mathcal{H}}\left(\hat{\mathcal{H}}|0\rangle\right) = \hat{\mathcal{H}}\,\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) = \frac{1}{\sqrt{2}}\left[\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) + \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right)\right] = |0\rangle. \tag{8.139a}$$

Absolutely similarly, we may get 52

$$\hat{\mathcal{H}}\left(\hat{\mathcal{H}}|1\rangle\right) = |1\rangle. \tag{8.139b}$$


Due to this reason, a better interpretation of the Hadamard transform is a $\pi$-rotation about the axis that
bisects the angle between axes x and z.
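Both statements (the involution property following from Eqs. (139), and the interpretation as a $\pi$-rotation about the bisecting axis) are easy to check numerically. The sketch below assumes the usual matrix representation of the basis states, $|0\rangle = (1, 0)^{\rm T}$ and $|1\rangle = (0, 1)^{\rm T}$, and the standard spin-1/2 rotation operator $\exp\{-i\varphi\,\mathbf{n}\cdot\boldsymbol{\sigma}/2\}$; the variable names are illustrative.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)      # Hadamard matrix, Eq. (8.138)
sigma_x = np.array([[0, 1], [1, 0]])
sigma_z = np.array([[1, 0], [0, -1]])

# Applying the transform twice returns any state to itself, Eqs. (8.139)
print(np.allclose(H @ H, np.eye(2)))

# Rotation by angle pi about the axis n = (x + z)/sqrt(2):
# exp{-i (pi/2) n.sigma}, computed via the eigen-decomposition of n.sigma
n_sigma = (sigma_x + sigma_z) / np.sqrt(2)
eigvals, eigvecs = np.linalg.eigh(n_sigma)
rotation = eigvecs @ np.diag(np.exp(-1j * np.pi / 2 * eigvals)) @ eigvecs.conj().T
# It coincides with H up to the inconsequential common phase factor -i
print(np.allclose(1j * rotation, H))

# In contrast, a pi/2-rotation in the [x, z] plane (about axis y) reproduces the
# action on |0>, but applied twice it turns |0> into |1> instead of returning it
Ry = np.array([[1, -1], [1, 1]]) / np.sqrt(2)
print(Ry @ Ry)
```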


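Now let us carry out an analysis of the “circuit” shown in Fig. 3b, minding all the time the
operator linearity, and the fact that the transformation rules (136)-(138) are only applicable to basis kets
of the initial (“input”) state vector. In particular, taking into account that according to Fig. 3b, the input
states of gate $f$ in this particular circuit are described by Eqs. (138), its output state’s ket is

$$\hat{f}\left(\hat{\mathcal{H}}|0\rangle\,\hat{\mathcal{H}}|1\rangle\right) = \hat{f}\left[\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)\frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right)\right] = \frac{1}{2}\left(\hat{f}|00\rangle - \hat{f}|01\rangle + \hat{f}|10\rangle - \hat{f}|11\rangle\right). \tag{8.140}$$

Now we may apply Eq. (136) to each of the basis kets to get:

$$\begin{aligned}
\hat{f}|00\rangle - \hat{f}|01\rangle + \hat{f}|10\rangle - \hat{f}|11\rangle
&= \hat{f}|0\rangle|0\rangle - \hat{f}|0\rangle|1\rangle + \hat{f}|1\rangle|0\rangle - \hat{f}|1\rangle|1\rangle \\
&= |0\rangle|0 \oplus f(0)\rangle - |0\rangle|1 \oplus f(0)\rangle + |1\rangle|0 \oplus f(1)\rangle - |1\rangle|1 \oplus f(1)\rangle \\
&= |0\rangle\left(|0 \oplus f(0)\rangle - |1 \oplus f(0)\rangle\right) + |1\rangle\left(|0 \oplus f(1)\rangle - |1 \oplus f(1)\rangle\right).
\end{aligned} \tag{8.141}$$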


50 In order to exclude any chance of confusion between the Hadamard transform’s operator $\hat{\mathcal{H}}$ and the
Hamiltonian operator $\hat{H}$, I have typeset them using different fonts.

51 Note that according to Eq. (138), operator $\hat{\mathcal{H}}$ does not belong to the limited class $\hat{U}$ described by Eq. (132).

52 Since states 0 and 1 form a full basis of the single qubit, Eqs. (139) may be summarized as an operator
equality: $\hat{\mathcal{H}}^2 = \hat{I}$.

Note that the expression in the first parentheses, characterizing the state of the target qubit, is equal to
$(|0\rangle - |1\rangle) \equiv (-1)^0(|0\rangle - |1\rangle)$ if $f(0) = 0$ (and hence $0 \oplus f(0) = 0$ and $1 \oplus f(0) = 1$), and to
$(|1\rangle - |0\rangle) \equiv (-1)^1(|0\rangle - |1\rangle)$ in the opposite case $f(0) = 1$, so that both cases may be described in one shot
by rewriting the parentheses as $(-1)^{f(0)}(|0\rangle - |1\rangle)$. The second parentheses is absolutely similarly controlled
by the value of $f(1)$, so that the state of the system at the output of gate $f$ is unentangled again:

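$$\hat{f}\left(\hat{\mathcal{H}}|0\rangle\,\hat{\mathcal{H}}|1\rangle\right) = \frac{1}{2}\left((-1)^{f(0)}|0\rangle + (-1)^{f(1)}|1\rangle\right)\left(|0\rangle - |1\rangle\right) = \pm\frac{1}{\sqrt{2}}\left(|0\rangle + (-1)^{F}|1\rangle\right)\frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right), \tag{8.142}$$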


where the last transition has used the fact that the Boolean function $F$, defined by Eq. (135), equals
$\pm[f(1) - f(0)]$ - compare the last two columns in Eq. (133). Since the common sign (i.e. the common
phase shift by $\pi$) is inconsequential, it may be prescribed to any of the component ket-vectors - for
example to that of the target qubit, as shown by the third pair of state labels in Fig. 3b.

This intermediate result is already rather remarkable. Indeed, it shows that, despite the
impression one could get from Fig. 4, gates $f$ and even C, being “controlled” by the source qubit, may
change that qubit’s state as well! This fact (partly reflected by the vertical direction of the control lines
in Figs. 3, 4, symbolizing the same stage of system’s evolution in time) shows how careful one should
be interpreting quantum-computational “circuits”.


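At the second stage of the circuit shown in Fig. 3b, the qubit components of state (142) are fed
into one more pair of Hadamard gates, whose outputs therefore are

$$\hat{\mathcal{H}}\,\frac{1}{\sqrt{2}}\left(|0\rangle + (-1)^{F}|1\rangle\right) = \frac{1}{\sqrt{2}}\left(\hat{\mathcal{H}}|0\rangle + (-1)^{F}\hat{\mathcal{H}}|1\rangle\right), \quad \text{and} \quad \hat{\mathcal{H}}\left[\pm\frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right)\right] = \pm\frac{1}{\sqrt{2}}\left(\hat{\mathcal{H}}|0\rangle - \hat{\mathcal{H}}|1\rangle\right). \tag{8.143}$$

Now using Eqs. (138) again, we see that the output state ket-vectors of the source and target qubits are,
respectively,

$$\frac{1 + (-1)^{F}}{2}|0\rangle + \frac{1 - (-1)^{F}}{2}|1\rangle, \quad \text{and} \quad \pm|1\rangle. \tag{8.144}$$

Since, according to Eq. (135), the Boolean function $F$ may take only values 0 or 1, the final state of the
source qubit is always one of its basis states $j$, namely the one with $j = F$. Its measurement (see Fig. 2)
immediately tells us whether function $f$, participating in Eq. (136), is constant or balanced. 53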

Thus, the quantum circuit shown in Fig. 3b indeed solves the Deutsch problem in one shot.
Reviewing our analysis, we may see that this is possible because the unitary transform performed by
gate $f$ is applied to quantum superpositions (138) rather than to the basis states. Due to this trick, the
quantum state components depending on $f(0)$ and $f(1)$ are processed simultaneously, in parallel. This
quantum parallelism may be extended to circuits with many ($N \gg 1$) qubits and, for some tasks,
provide a dramatic performance increase - for example, reducing the necessary circuit component
number from $O(\exp\{N\})$ to $O(N^p)$, where $p$ is a finite (and not very big) number.
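The whole analysis of the circuit in Fig. 3b may be repeated numerically. The following is a minimal state-vector sketch, not a model of any real device: it assumes the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (source qubit first), the input state $|0\rangle|1\rangle$ implied by Eq. (8.140), and hypothetical function names `gate_matrix` and `deutsch`.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard matrix, Eq. (8.138)

def gate_matrix(f):
    """Matrix of |j1>|j2> -> |j1>|j2 XOR f(j1)>, Eq. (8.136); index = 2*j1 + j2."""
    U = np.zeros((4, 4))
    for j1 in (0, 1):
        for j2 in (0, 1):
            U[2 * j1 + (j2 ^ f(j1)), 2 * j1 + j2] = 1
    return U

def deutsch(f):
    """Return the measured value of the source qubit after ONE use of gate f."""
    state = np.kron([1, 0], [0, 1])              # input state |0>|1>
    state = np.kron(H, H) @ state                # first pair of Hadamard gates
    state = gate_matrix(f) @ state               # the single call of the black box
    state = np.kron(H, H) @ state                # second pair of Hadamard gates
    prob_source_1 = np.sum(np.abs(state[2:]) ** 2)   # probability W_1 of the source qubit
    return int(round(prob_source_1))             # 0: constant, 1: balanced

for name, f in [("constant 0", lambda j: 0), ("YES", lambda j: j),
                ("NOT", lambda j: 1 - j), ("constant 1", lambda j: 1)]:
    print(name, "-> F =", deutsch(f))
```

In each of the four cases the probability is exactly 0 or 1 (up to rounding errors), so that a single measurement of the source qubit gives $F$ with certainty, as Eq. (8.144) predicts.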


53 This means that the last Hadamard transform of the target qubit (i.e. the Hadamard gate shown in the lower
right corner of Fig. 3b) is not necessary for the Deutsch problem solution - though it should be included if we
want the whole circuit to satisfy the general condition (132).

However, this efficiency comes at a high price. Indeed, let us discuss the physical
implementation of quantum gates, starting from the Hadamard gate, which performs a single-qubit
transform - see Eq. (138). With the linearity requirement, its action on the arbitrary state (125) should be

$$\hat{\mathcal{H}}|\alpha\rangle = a_0\hat{\mathcal{H}}|0\rangle + a_1\hat{\mathcal{H}}|1\rangle = a_0\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) + a_1\frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right) = \frac{1}{\sqrt{2}}(a_0 + a_1)|0\rangle + \frac{1}{\sqrt{2}}(a_0 - a_1)|1\rangle, \tag{8.145}$$

meaning that the state expansion coefficients in the end ($t = T$) and beginning ($t = 0$) of the qubit
evolution in time have to be related as

$$a_0(T) = \frac{a_0(0) + a_1(0)}{\sqrt{2}}, \qquad a_1(T) = \frac{a_0(0) - a_1(0)}{\sqrt{2}}. \tag{8.146}$$


This task may be again performed using the Rabi oscillations, which were discussed in Sec. 6.5,
i.e. by applying to the qubit (a two-level system), for a limited time period $T$, a weak sinusoidal external
signal of frequency $\omega$ equal to the intrinsic quantum oscillation frequency $\omega_{nn'}$ - defined by Eq. (6.85). A
perturbative analysis of the Rabi oscillations was carried out in Sec. 6.5, even for nonvanishing (though
small) detuning $\Delta = \omega - \omega_{nn'}$, but only for the particular initial conditions when at $t = 0$ the system was in
one of the basis states (there labeled as $n'$), i.e. the other state (there labeled $n$) was empty. For our
current purposes we need to find coefficients $a_{0,1}(t)$ of expansion (125) for arbitrary initial conditions
$a_{0,1}(0)$, subject only to the time-independent normalization condition $|a_0|^2 + |a_1|^2 = 1$. For the case of
exact tuning, $\Delta = 0$, the solution of Eqs. (6.94) is elementary, and gives, instead of Eq. (6.102), 54 the
following solutions:

$$\begin{aligned}
a_0(t) &= a_0(0)\cos\Omega t - i a_1(0)e^{i\varphi}\sin\Omega t, \\
a_1(t) &= a_1(0)\cos\Omega t - i a_0(0)e^{-i\varphi}\sin\Omega t,
\end{aligned} \tag{8.147}$$

where $\Omega$ is the Rabi oscillation frequency (6.101), in the exact-tuning case proportional to amplitude $|A|$
of the external rf drive $A = |A|\exp\{i\varphi\}$, while $\varphi$ is the phase of the driving signal - see Eqs. (6.86)-
(6.87). Comparing these expressions with Eqs. (146), we see that for $t = T = \pi/4\Omega$ and $\varphi = \pi/2$ they
“almost” coincide, besides the opposite sign of $a_1(T)$.

Conceptually the simplest way to correct this deficiency is to follow the rf “$\pi/4$-pulse”, just
discussed, by a short dc “$\pi$-pulse” of duration $T' = \pi\hbar/\delta$, which temporarily creates a small additional
energy difference $\delta$ between basis states 0 and 1. According to the basic Eq. (1.61), such difference
creates an additional phase difference $T'\delta/\hbar$ between the states, equal to $\pi$ for the “$\pi$-pulse”.
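This two-step recipe may be verified by writing Eqs. (8.147) as a 2×2 matrix acting on the column of coefficients $(a_0, a_1)$. The sketch below is only a check of the algebra; the name `rabi_matrix` is hypothetical, and the dc pulse is represented (up to an inconsequential common phase) by the diagonal matrix that shifts the phase of state 1 by $\pi$ relative to state 0.

```python
import numpy as np

def rabi_matrix(omega_t, phi):
    """Matrix form of Eqs. (8.147): (a0(t), a1(t)) = M (a0(0), a1(0))."""
    c, s = np.cos(omega_t), np.sin(omega_t)
    return np.array([[c, -1j * np.exp(1j * phi) * s],
                     [-1j * np.exp(-1j * phi) * s, c]])

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)      # required transform, Eq. (8.146)

# rf "pi/4-pulse": Omega*T = pi/4, phase phi = pi/2
pulse = rabi_matrix(np.pi / 4, np.pi / 2)
print(np.round(pulse * np.sqrt(2)).real)          # differs from H only by the sign of the second row

# dc "pi-pulse": additional phase difference pi between states 0 and 1
phase_pulse = np.diag([1, -1])
print(np.allclose(phase_pulse @ pulse, H))
```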

Another way (that may be also useful for two-qubit operations) is to use another, auxiliary
energy level $E_2$ whose distances from the basic levels $E_1$ and $E_0$ are significantly different from the
difference $(E_1 - E_0)$ - see Fig. 5a. In this case, the weak external rf field tuned to any of 3 potential
quantum transition frequencies $\omega_{nn'} \equiv (E_n - E_{n'})/\hbar$ initiates such transitions between the corresponding
states only, with a negligible perturbation of the state not involved in this transition. Such transitions
may be again described by Eqs. (147), with the appropriate index changes. For the Hadamard transform
implementation, it is sufficient to apply (after the already discussed $\pi/4$-pulse of frequency $\omega_{10}$, and with
the initially empty level $E_2$), an additional $\pi$-pulse of frequency $\omega_{20}$, with any phase $\varphi$. Indeed, according
to the first of Eqs. (147), with the due replacement $a_1(0) \to a_2(0) = 0$, such pulse flips the sign of
coefficient $a_0(t)$, while coefficient $a_1(t)$, not involved in this additional transition, remains unchanged.

54 To comply with our current notation, coefficients $a_{n'}$ and $a_n$ of Sec. 6.5 should be replaced with $a_0$ and $a_1$. Also
note that their definition (6.82) implies that the trivial time evolution (6.81) of unperturbed qubits has been
already excluded from these expansion coefficients.

Fig. 8.5. Energy-level schemes used for unitary transformations of (a) single qubits and (b, c) two-qubit systems.


Now let me describe the conceptually simplest (though, for some qubit types, not practically
most convenient) scheme for the implementation of the CNOT gate, whose action is described by a
linear unitary operator satisfying Eq. (137). For that, evidently, qubits have to be allowed to interact for some time
$T$. As was repeatedly discussed in two past chapters, in most cases such interaction of two subsystems is
bilinear - see, e.g., Eq. (6.148). For qubits, i.e. two-level systems, each of the component operators may
be represented by a 2×2 matrix in the basis of states 0 and 1. According to Eq. (4.105), such matrix may
be expressed as a linear combination $(c_0\hat{I} + \mathbf{c}\cdot\hat{\boldsymbol{\sigma}})$, where $c_0$ and three Cartesian components of vector $\mathbf{c}$
are c-numbers. Let us take such bilinear interaction Hamiltonian in the simplest form

$$\hat{H}_{\rm int}(t) = \begin{cases} \kappa\,\hat{\sigma}_z^{(1)}\hat{\sigma}_z^{(2)}, & \text{for } 0 < t < T, \\ 0, & \text{otherwise}, \end{cases} \tag{8.148}$$

where the upper index is the qubit number, and $\kappa$ is a c-number constant. 55 According to Eq. (4.175),
by the end of the interaction period, this Hamiltonian produces the following unitary transform:

$$\hat{U}_{\rm int} = \exp\left\{-\frac{i}{\hbar}\hat{H}_{\rm int}T\right\} = \exp\left\{-\frac{i}{\hbar}\kappa\,\hat{\sigma}_z^{(1)}\hat{\sigma}_z^{(2)}T\right\}. \tag{8.149}$$

Since in the basis of unperturbed two-bit states $|j_1 j_2\rangle$ the product operator $\hat{\sigma}_z^{(1)}\hat{\sigma}_z^{(2)}$ is diagonal, so is the
unitary operator (149), with the following action on the basis states:

$$\hat{U}_{\rm int}|j_1 j_2\rangle = \exp\left\{i\theta\,\sigma_z^{(1)}\sigma_z^{(2)}\right\}|j_1 j_2\rangle, \tag{8.150}$$


55 The assumption of simultaneous time independence of the basis state vectors and the interaction operator
(within the time interval $0 < t < T$) is possible only if the basis state energy difference $\Delta$ of both qubits is exactly
the same. For this case, the simple physical explanation of the time evolution (149) follows from Fig. 8.5, which
shows the spectrum of the total energy $E = E_1 + E_2$ of the two-bit system. In the absence of interaction, the
energies of two basis states, $|01\rangle$ and $|10\rangle$, are equal, enabling even a weak qubit interaction to cause their
substantial evolution in time - see Sec. 6.7. If the qubit energies are different (Fig. 5c), the interaction
may still be reduced, in the rotating-wave approximation, to Eq. (149), by compensating the energy
difference $(\Delta_1 - \Delta_2)$ with an external rf signal of frequency $\omega = (\Delta_1 - \Delta_2)/\hbar$ - see Sec. 6.5.

where $\theta \equiv -\kappa T/\hbar$, and $\sigma_z$ are the eigenvalues of the Pauli matrix $\sigma_z$ for the basis states of the
corresponding qubit: $\sigma_z = +1$ for $|j\rangle = |0\rangle$, and $\sigma_z = -1$ for $|j\rangle = |1\rangle$. Let me, for clarity, spell out Eq. (150)
for the particular case $\theta = -\pi/4$ (corresponding to the qubit coupling time $T = \pi\hbar/4\kappa$):

$$\hat{U}_{\rm int}|00\rangle = e^{-i\pi/4}|00\rangle, \quad \hat{U}_{\rm int}|01\rangle = e^{i\pi/4}|01\rangle, \quad \hat{U}_{\rm int}|10\rangle = e^{i\pi/4}|10\rangle, \quad \hat{U}_{\rm int}|11\rangle = e^{-i\pi/4}|11\rangle. \tag{8.151}$$


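In order to compensate the undesirable parts of this joint phase shift of the basis states, let us
apply (either before or after it) similar individual “rotations” of each qubit by angle $\theta' = +\pi/4$, using the
following product of two independent operators, plus (just for the result clarity) a common, and hence
inconsequential, phase shift $\theta'' = -\pi/4$: 56

$$\hat{U}_{\rm com} = \exp\left\{i\theta'\left(\hat{\sigma}_z^{(1)} + \hat{\sigma}_z^{(2)}\right) + i\theta''\right\} = \exp\left\{i\frac{\pi}{4}\hat{\sigma}_z^{(1)}\right\}\exp\left\{i\frac{\pi}{4}\hat{\sigma}_z^{(2)}\right\}\exp\left\{-i\frac{\pi}{4}\right\}. \tag{8.152}$$

Since this operator is also diagonal in the $|j_1 j_2\rangle$ basis, it is equally easy to calculate the change of the
basis states by the total unitary operator $\hat{U}_{\rm t} \equiv \hat{U}_{\rm com}\hat{U}_{\rm int}$:

$$\hat{U}_{\rm t}|00\rangle = |00\rangle, \quad \hat{U}_{\rm t}|01\rangle = |01\rangle, \quad \hat{U}_{\rm t}|10\rangle = |10\rangle, \quad \hat{U}_{\rm t}|11\rangle = -|11\rangle. \tag{8.153}$$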

This result already shows the main “miracle action” of two-qubit gates, such as shown in Fig. 4:
the source qubit is left intact (only if it is in a basis state!), while the state of the target qubit is altered.
True, this is still different from the CNOT operator’s action (137), but may be readily reduced to it by
sandwiching the transform $\hat{U}_{\rm t}$ between two Hadamard transforms applied to the target qubit:

$$\hat{C} = \hat{\mathcal{H}}^{(2)}\hat{U}_{\rm t}\hat{\mathcal{H}}^{(2)}. \tag{8.154}$$
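The chain of transformations leading from the bilinear interaction to the CNOT gate may be checked numerically. The following sketch assumes the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (source qubit first) and uses the eigenvalues $\sigma_z = \pm 1$ directly, since all operators before the Hadamard sandwiching are diagonal; the variable names are illustrative.

```python
import numpy as np

sz = np.array([1, -1])                 # eigenvalues of sigma_z for |0> and |1>
szz = np.kron(sz, sz)                  # eigenvalues of sigma_z^(1) sigma_z^(2): [1, -1, -1, 1]
sz1 = np.kron(sz, [1, 1])              # eigenvalues of sigma_z^(1): [1, 1, -1, -1]
sz2 = np.kron([1, 1], sz)              # eigenvalues of sigma_z^(2): [1, -1, 1, -1]

theta = -np.pi / 4                     # interaction phase, Eq. (8.151)
theta_1 = np.pi / 4                    # individual qubit "rotations", Eq. (8.152)
theta_2 = -np.pi / 4                   # common phase shift, Eq. (8.152)

U_int = np.diag(np.exp(1j * theta * szz))
U_com = np.diag(np.exp(1j * theta_1 * (sz1 + sz2) + 1j * theta_2))
U_t = U_com @ U_int                    # total diagonal transform, Eq. (8.153)
print(np.round(np.diag(U_t)).real)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
H_target = np.kron(np.eye(2), H)       # Hadamard transform of the target (second) qubit
C = H_target @ U_t @ H_target          # Eq. (8.154)
print(np.round(C).real)
```

The diagonal of `U_t` comes out as (1, 1, 1, -1), and the final matrix exchanges only $|10\rangle$ and $|11\rangle$, i.e. reproduces Eq. (8.137b).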

We have spent quite a bit of time on the discussion of the CNOT gate, 57 and now I can reward
the reader for his/her effort with a bit of good news: it has been proved that an arbitrary unitary
transform that satisfies Eq. (132), i.e. may be used within the general scheme outlined in Fig. 2, may be
decomposed into a set of CNOT gates mixed with simpler single-qubit gates - for example, the
Hadamard gate plus the $\pi/2$ rotation discussed above. 58 Unfortunately, I have no time for a detailed
discussion of more complex circuits. 59 Perhaps the most famous of them is the scheme for integer


56 As Eq. (4.175) shows, each of the component unitary transforms $\exp\{i\theta'\hat{\sigma}_z\}$ may be created by applying to each
qubit, for a time period $T' = \hbar\theta'/\kappa'$, a constant external field described by Hamiltonian $\hat{H} = -\kappa'\hat{\sigma}_z$. We already
know that for a charged, spin-1/2 particle, such Hamiltonian may be created by applying z-oriented external
constant magnetic field - see Eq. (4.163). For most other physical implementations of qubits, the organization of
such Hamiltonian is also straightforward - see, e.g., Fig. 7.4 and its discussion.

57 As was discussed above, this gate is identical to quantum gate $f$ for $f = f_2$, i.e. $f(j) = j$. The implementation of $f$
for 3 other functions $f$ requires straightforward modifications whose analysis is left for reader’s exercise.

58 This fundamental importance of the CNOT gate was perhaps a major reason why D. Wineland, the leader of the 
NIST group that had demonstrated the first experimental implementation in 1995 (following the theoretical 
suggestion by J. Cirac and P. Zoller), was awarded the 2012 Nobel Prize (shared with S. Haroche, the leader of 
another leading group working towards quantum computation). 

59 For that, the reader may be referred to either the monograph by Nielsen and Chuang, cited above, or to a shorter
(but more formal) textbook by N. Mermin, Quantum Computer Science, Cambridge U. Press, 2007.

number factoring, suggested in 1994 by P. Shor. 60 Due to its potential practical importance for breaking
broadly used communication encryption schemes such as the RSA code, 61 this opportunity has incited a
huge wave of enthusiasm, and triggered experimental efforts to implement quantum gates and circuits
using a broad variety of two-level quantum systems. Presently, the following options are most eagerly
pursued: 62

(i) Trapped ions. The first experimental demonstrations of quantum state manipulation
(including the already mentioned first CNOT gate) have been carried out using deeply cooled atoms in
optical traps, similar to those used in frequency and time standards. Their electron spins are natural
qubits, whose states may be manipulated using the Rabi transfers excited by suitably tuned lasers. The
spin interactions with environment may be very weak, resulting in large dephasing times ($T_2$, see Sec.
7.3), up to a few seconds. Since the distances between atoms in the traps are relatively large (of the
order of a micron), their direct spin-spin interaction is even weaker, but atoms may be made effectively 
interacting either via their mechanical oscillations about the potential minima of the trapping field, or 
via photons in electromagnetic resonators (“cavities”). 63 Perhaps the main challenge of using this 
approach for quantum computation is poor “scalability”, i.e. the enormous challenge of creating large, 
ordered systems of individually addressable qubits. 

(ii) Nuclear spins are also typically very weakly connected to environment, with $T_2$ exceeding 10
seconds in some cases. Their eigenenergies $E_0$ and $E_1$ may be split by external dc magnetic fields
(typically, of the order of 10 T), while the interstate Rabi transfers may be readily achieved by
application of external rf fields with frequencies $\omega = (E_1 - E_0)/\hbar$ of a few hundred MHz. 64 The
challenges of this option include the weakness of spin-spin interactions (typically mediated through
molecular electrons), resulting in a very slow spin evolution, whose time scale $\hbar/\kappa$ may become
comparable with $T_2$, and small level separations $E_1 - E_0$, corresponding to a few K, 65 i.e. much smaller
than the room temperature, creating a problem with qubit state preparation. 66

Despite these challenges, the nuclear spin option was used for the first implementation of the
Shor algorithm for factoring of a small number (15 = 5×3) as early as in 2001. 67 However, the extension
of this success to larger systems, beyond the set of spins inside one molecule, is problematic.

(iii) Josephson-junction devices. Much better scalability may be achieved with solid state
devices, especially in superconductor integrated circuits including weak contacts - Josephson junctions. 
As was already discussed in Sec. 2.8, if the coupling of a Josephson junction to its dissipative 
environment is sufficiently weak (in particular if its effective parallel resistance is much higher than the 


60 His original paper was published only in proceedings of a meeting, but a clear description of the algorithm may 
be found in several accessible sources including Wikipedia (http://en.wikipedia.org/wiki/Shor’s algorithm) . 

61 Named after R. Rivest, A. Shamir, and L. Adleman, the authors of the first open publication of the code in 
1977, but actually invented earlier (in 1973) by C. Cocks. 

62 For more details, and a discussion of other possible implementations (such as quantum dots and dopants in 
crystals) see, e.g., T. Ladd et al.. Nature 464, 45 (2010), and references therein. 

63 A brief discussion of such interactions (so-called Cavity QED) will be given in Sec. 9.4 below. 

64 In this field, the condition $\omega = \omega_{10}$, discussed above, is called the nuclear magnetic resonance, or
NMR - the term well known due to the broad application of this effect in chemistry and medicine.

65 See Eq. (4.5) and its discussion. 

66 This challenge may be partly mitigated using ingenious spin manipulation techniques such as refocusing - see, 
e.g., either Sec. 7.7 in Nielsen and Chuang, or J. Keeler’s monograph cited in the end of Sec. 6.5. 

67 L. Vandersypen et al., Nature 414, 883 (2001).

quantum resistance unit $R_Q \sim 10^4\ \Omega$), the Josephson phase variable $\varphi$ behaves as a coordinate of a 1D
quantum particle with effective mass (2.252), moving in a $2\pi$-periodic potential - see Eq. (2.250). This
fact creates several opportunities for qubit implementation using quantum behavior of this macroscopic
degree of freedom.

In an insulated junction, 68 the phase motion in the periodic potential $U(\varphi) = -E_J\cos\varphi$ creates the
energy band structure $E(q)$ that was discussed in detail in Sec. 2.7. In particular, in the weak potential
limit (which, for the Josephson junction case, is valid at $E_J \ll e^2/2C$ - see the discussion in Sec. 2.8),
the lowest bandgaps are very narrow, and function $E(q)$ in their vicinity is well described by the usual
level anticrossing - see Figs. 2.28 and 2.29 and their discussion. The translation of this fact to the
Josephson junction language (see, in particular, Eq. (2.256) and its discussion) shows that the values of
the effective electric charge $Q$ of the junction, on two anticrossing energy branches, differ by charge $2e$
of one Cooper pair. Since, according to Eq. (2.222) and its discussion, the system dynamics in this case
is reduced to the interaction of these two states with different $Q$, in application to quantum computation
this system is called the charge qubit. Unfortunately, the states of such qubit are rather sensitive to
random charged impurities in the junction’s vicinity, causing strong fluctuations, and hindering its control,
so this option is not actively pursued nowadays.


Other options are based on the modification of potential $U(\varphi)$ at Josephson junction
incorporation into superconducting loops, i.e. in SQUIDs. 69 In the simplest case of a single loop of
inductance $L$ closed by one junction with critical current $I_c$, the total potential energy of the system in an
external magnetic field is 70

$$U(\varphi) = E_J\left[\frac{(\varphi - \varphi_{\rm ext})^2}{2\beta_L} - \cos\varphi\right], \quad \text{with } E_J \equiv \frac{\hbar I_c}{2e}, \quad \beta_L \equiv \frac{2e}{\hbar}I_c L, \tag{8.155}$$


where $\varphi_{\rm ext}$ is proportional to the external magnetic flux $\Phi_{\rm ext}$ through the loop. According to this relation,
at $E_J \gg e^2/2C$ (corresponding to the tight-binding limit of the energy band theory), one convenient way
to implement a two-level system is to take the dimensionless inductance parameter $\beta_L$ above but very
close to 1 ($0 < \beta_L - 1 \ll 1$), the “symmetrizing” magnetic field ($\varphi_{\rm ext} \approx \pi$), and $E_J \sim (e^2/C)/(\beta_L - 1)^3$. In
this case, the potential profile has the shape of a nearly symmetrical double well, with ground states in
each well coupled by tunneling through a relatively low tunnel barrier, creating a pair of eigenstates
with relatively low eigenenergy splitting $\Delta \equiv E_1 - E_0 \ll E_J$ (Fig. 6a).
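The appearance of the double well at $\beta_L > 1$ may be illustrated by a simple numerical sketch. It assumes the normalized potential $U(\varphi)/E_J = (\varphi - \varphi_{\rm ext})^2/2\beta_L - \cos\varphi$ (my reading of Eq. (8.155)), the symmetrizing bias $\varphi_{\rm ext} = \pi$, and illustrative values $\beta_L = 1.1$ and $0.9$; the function names are hypothetical.

```python
import numpy as np

def potential(phi, beta_L, phi_ext=np.pi):
    """Normalized potential U/E_J of a single-junction SQUID (assumed form)."""
    return (phi - phi_ext) ** 2 / (2 * beta_L) - np.cos(phi)

def count_minima(beta_L, phi_ext=np.pi):
    """Count local minima of U on a grid around the symmetry point phi = phi_ext."""
    phi = np.linspace(phi_ext - 1.5, phi_ext + 1.5, 3001)
    u = potential(phi, beta_L, phi_ext)
    # a grid point is a local minimum if it lies below both of its neighbors
    is_minimum = (u[1:-1] < u[:-2]) & (u[1:-1] < u[2:])
    return int(np.sum(is_minimum))

print(count_minima(1.1))   # beta_L slightly above 1: two wells separated by a low barrier
print(count_minima(0.9))   # beta_L below 1: a single well
```

At $\varphi = \pi$ the curvature of this potential is proportional to $(1/\beta_L - 1)$, so that the point turns from a minimum into a low barrier top exactly when $\beta_L$ exceeds 1.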


Fig. 8.6. Typical potential profiles $U(\varphi)$ and energy levels of SQUID-based qubits: (a) “flux qubit” and
(b) “phase qubit”. Red dashed lines show eigenenergies of the used states 0 and 1.


68 For the purposes of E_J control, it is more convenient to use two-junction configurations called Bloch
transistors. Unfortunately, I do not have time to go into these details.

69 See, e.g., EM Sec. 6.4 and references therein. 

70 This expression directly follows from combining EM Eqs. (6.57), (6.59), and (6.70). 

Such flux qubits have a relatively large magnitude |Φ_10| = |Φ_01| of the matrix elements of the
operator of the magnetic flux Φ = (ħ/2e)φ piercing the SQUID loop. This certainly makes the arrangement
of the necessary coupling between flux qubits (see, e.g., Eq. (149) and its discussion) very easy, despite the
macroscopic (~10 μm) sizes of SQUIDs and hence of the distances between them, decreasing the time 𝒯
~ ħ/κ necessary for the most critical two-bit (e.g., CNOT) operations to just a few nanoseconds.
However, the large flux matrix elements also increase the undesirable coupling of such qubits to the
dephasing environment, and hence decrease the dephasing time T_2 - typically, to just a few tens or hundreds
of nanoseconds, uncomfortably close to 𝒯.

This coupling may be decreased, leading to a substantial increase of T_2 (up to a few
microseconds), by moving the bias phase φ_ext away from the symmetrizing value π, i.e. using the
asymmetric potential profile sketched in Fig. 6b. The working states 0 and 1 of such a phase qubit,
localized in a higher potential well (shown left in Fig. 6b), are actually metastable, but with a very long
lifetime because of the relatively high barrier separating the wells. An additional benefit of this
arrangement is that a fast lowering of the tunnel barrier causes the system in state 1 to tunnel into the
lower well, with the subsequent energy relaxation (see the arrows in Fig. 6b); this process may be used for
qubit state readout. A major problem of phase qubits is that the part of the potential U(φ) in which the qubit
states are localized is almost quadratic, so that the energy levels are nearly equidistant - cf. Eqs.
(2.114), (6.15), and (6.22). 71 As a result, the external rf drive of frequency ω = (E_1 − E_0)/ħ, used to
arrange the state transforms described by Eq. (146), may induce simultaneous undesirable transitions to
(and between) higher energy levels. This effect may be mitigated by reducing the rf drive amplitude
(see Problem 6.6), but at the price of a proportional increase of the transfer time 𝒯, which may again become
comparable to T_2. Despite this problem, phase qubits have been used for a successful experimental
demonstration of the core single-operand and two-operand gates, and recently, for the reproduction of
number 15 factoring “48% of the time”. 72

(iv) Optical systems pose a special challenge for quantum computation: due to the virtual 
linearity of most electromagnetic media at reasonable light power, the implementation of interaction 
Hamiltonians, such as (149), is problematic. However, in 2001 a very smart way around this hurdle was 
invented. 73 In this KLM scheme, nonlinear elements are not needed, and quantum gates may be 
composed just of linear devices (such as optical waveguides, mirrors and beam splitters), plus single- 
photon sources and detectors. Unfortunately, a quantitative discussion of this scheme would require 
using the basics of quantum electrodynamics that will be discussed only in the next chapter. The work in 
this direction has already led to an experimental demonstration of factoring the number 21 = 3×7 (which in
some aspects is easier than that of 15). 74

Let me, however, note that due to the statistical nature of Shor’s algorithm, and the so-far
imperfect fidelity of qubit manipulations, all number factoring experiments carried out so far may be
more fairly described merely as demonstrations of the consistency of their results with the (evident)
mathematical facts. So, despite a very substantial research effort, the progress is rather slow, with the


71 This is even more true for the so-called “transmons” (or “Xmons”) - the phase qubit versions in which a
Josephson junction is just a part of an external resonator, providing it with a small nonlinearity (anharmonicity) -
see, e.g., R. Barends et al., Nature 508, 500 (2014) and references therein.

72 E. Lucero et al., Nature Physics 8, 719 (2012). 

73 E. Knill et al., Nature 409, 46 (2001).

74 E. Martín-López et al., Nature Photonics 6, 773 (2012).

main culprit being the unintentional coupling of qubits to the environment, leading most importantly to their
state dephasing, and eventually to errors. (Another major problem of this research field is the lack of
algorithms (besides Shor’s number factoring) that would give quantum computation a substantial
advantage over classical counterparts, and hence a potential customer base broader than the
communication encryption community, that could provide the necessary significant support.)

Of course, some error probability exists in classical digital logic gates and memory cells as well.
However, in this case, there is no conceptual problem with the device state measurement, so that the
error may be detected and corrected in many ways; perhaps the simplest one is the so-called majority
voting. For that, the input bit is reproduced in several (say, three) copies and sent to three similar
devices whose outputs are measured and compared. If the output bits differ, at least one of the devices
has made an error. The error may be not only detected, but also corrected by taking the two coinciding
output bits for the correct one. If the probability of a single device error is W << 1, the probability of
error of any particular device pair is close to W², and that of some pair (and hence of the whole majority voting
scheme) is close to 3W². Since for the currently dominating CMOS integrated circuits, W is very small,
such an error correction circuit creates a dramatic fidelity improvement - at the cost of higher circuit
complexity (which may be traded for larger time delay) and consumed power.
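
The majority-voting estimate above is easy to check by brute force. The following Python sketch (an illustration only; the function name is hypothetical, and the devices are assumed to err independently) sums the probabilities of all error patterns in which the wrong bit wins the vote, and compares the exact result with the approximation 3W².

```python
from itertools import product

def majority_vote_error(w, copies=3):
    """Exact probability that a majority of `copies` independent devices err,
    each with the single-device error probability w."""
    total = 0.0
    for errors in product((0, 1), repeat=copies):
        if sum(errors) > copies // 2:  # the wrong bit wins the vote
            p = 1.0
            for e in errors:
                p *= w if e else (1.0 - w)
            total += p
    return total

for w in (1e-2, 1e-4):
    print(w, majority_vote_error(w), 3 * w ** 2)
```

For three copies the exact sum is 3W²(1 − W) + W³, which differs from 3W² only in the next order in W.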

For quantum computation, the general idea of using several devices (say, qubits) for coding the
same information remains the same; however, there are two major complications, both due to the analog
nature of qubit states. First, as we know from Chapter 7, the dephasing effect of the environment may be
described as a slow random drift of the coefficients a_j in expansion (128), leading to the deviation of the
output state from the basis form (132), and hence to a nonvanishing probability of wrong qubit state
readout (Fig. 2). Hence the quantum error correction has to protect the result not only against possible
random state flips 0 ↔ 1 as in the classical digital computer, but also against these “creeping” analog
errors.

Second, the qubit state is impossible to copy exactly (clone) without disturbing it, as follows
from the following simple calculation. 75 Cloning the state α of one qubit to another qubit, initially in an
independent state (say, the basis state 0), means the following transformation of the two-qubit ket: |α0⟩
→ |αα⟩. If we want such a transform to be performed by a real quantum system whose evolution is
described by a unitary operator û, and to be correct for an arbitrary state α, it has to work not only for
both basis states of the qubit:

$$\hat{u}|00\rangle = |00\rangle, \qquad \hat{u}|10\rangle = |11\rangle, \qquad (8.156)$$

but also for their arbitrary linear combination (125). Since the operator û has to be linear, we may use Eq.
(156) to calculate

$$\hat{u}|\alpha 0\rangle = \hat{u}\left(a_0|0\rangle + a_1|1\rangle\right)|0\rangle = a_0\hat{u}|00\rangle + a_1\hat{u}|10\rangle = a_0|00\rangle + a_1|11\rangle. \qquad (8.157)$$

On the other hand, the desired result of cloning is

$$|\alpha\alpha\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)\left(a_0|0\rangle + a_1|1\rangle\right) = a_0^2|00\rangle + a_0 a_1\left(|10\rangle + |01\rangle\right) + a_1^2|11\rangle, \qquad (8.158)$$

i.e. evidently different, so that, for an arbitrary α,


75 Amazingly, this no-cloning theorem was discovered as late as in 1982 (independently by W. Wootters and W.
Zurek, and by D. Dieks) - in the context of work toward quantum cryptography.

$$\hat{u}|\alpha 0\rangle \neq |\alpha\alpha\rangle, \qquad (8.159)\ \text{[No-cloning theorem]}$$

showing that the qubit state cloning is indeed impossible. 76

This problem may be circumvented in the way shown in Fig. 7a. Here the CNOT gate, whose
action is described by Eq. (137), entangles an arbitrary input state (125) of the source qubit with a basis
initial state of an ancillary qubit - frequently called the ancilla. Using Eq. (137), we may readily calculate
the output two-qubit state’s vector:

$$|\alpha\rangle_{N=2} = \hat{C}\left(a_0|0\rangle + a_1|1\rangle\right)|0\rangle = a_0\hat{C}|00\rangle + a_1\hat{C}|10\rangle = a_0|00\rangle + a_1|11\rangle. \qquad (8.160)\ \text{[Quasi-cloning]}$$


We see that this circuit does perform operation (157), i.e. re-prescribes the initial source qubit’s
expansion coefficients a_0 and a_1 equally to two qubits, i.e. duplicates the input information, though in
contrast with the “genuine” cloning, it changes the state of the source qubit. Such “quasi-cloning” is the
key to virtually all quantum error correction techniques.
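
The difference between the quasi-cloned state and the “genuine” clone may be made tangible by a short numerical check. The Python sketch below is an illustration only: the function names, the ordering of amplitudes (|00⟩, |01⟩, |10⟩, |11⟩, with the source qubit first), and the sample state are assumptions made for this example. It builds both two-qubit states for one input and prints their squared overlap.

```python
import math

def cnot(state):
    """CNOT on a two-qubit state [c00, c01, c10, c11]; the first qubit is the source."""
    c00, c01, c10, c11 = state
    return [c00, c01, c11, c10]  # swaps the amplitudes of |10> and |11>

def product_state(q1, q2):
    """Direct product of two single-qubit states [a0, a1] and [b0, b1]."""
    return [q1[0] * q2[0], q1[0] * q2[1], q1[1] * q2[0], q1[1] * q2[1]]

def overlap(s1, s2):
    """Inner product <s1|s2>."""
    return sum(x.conjugate() * y for x, y in zip(s1, s2))

alpha = [math.cos(0.3), 1j * math.sin(0.3)]         # sample normalized input state
quasi_clone = cnot(product_state(alpha, [1, 0]))    # a0|00> + a1|11>
true_clone = product_state(alpha, alpha)            # the (unreachable) state |alpha alpha>
print(abs(overlap(true_clone, quasi_clone)) ** 2)   # below 1 unless alpha is a basis state
```

For the basis states the two results coincide, in agreement with Eq. (156); for any superposition the overlap drops below 1.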

Fig. 8.7. (a) Quasi-cloning, and (b) detection and correction of dephasing errors in a single qubit.


Consider, for example, the three-qubit circuit shown in Fig. 7b. At its input, the double
application of the quasi-cloning produces an intermediate state A with the ket-vector

$$|A\rangle = a_0|000\rangle + a_1|111\rangle, \qquad (8.161)$$

which is an evident generalization of Eq. (160). 77 Subjecting the source qubit to the Hadamard transform
(138), we get the three-qubit state B represented by the vector

$$|B\rangle = a_0\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)|00\rangle + a_1\frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right)|11\rangle. \qquad (8.162)$$

Now let us assume that at this stage, the source qubit comes into contact with a dephasing
environment (in Fig. 7, symbolized by the single-qubit “gate” φ). As we know from Sec. 7.3, its effect

76 This does not mean that several qubits cannot be put into the same, arbitrary quantum state - theoretically, with
arbitrary precision. Indeed, they may be first set into their lowest-energy stationary states as was discussed above,
and then driven into an arbitrary state (125) by exerting on them similar classical external “forces”. So, the no-cloning
theorem pertains only to an unknown state α of a qubit.

77 Such a state is also the 3-qubit example of the so-called Greenberger-Horne-Zeilinger (GHZ) states, which are
frequently called the “most entangled” states of a system of N > 2 qubits.
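
(besides some inconsequential shift of the common phase) may be described by a random mutual phase shift
of the basis states: 78

$$|0\rangle \to e^{i\varphi}|0\rangle, \qquad |1\rangle \to e^{-i\varphi}|1\rangle. \qquad (8.163)$$

As a result, for the intermediate state C (see Fig. 7b) we may write

$$|C\rangle = a_0\frac{1}{\sqrt{2}}\left(e^{i\varphi}|0\rangle + e^{-i\varphi}|1\rangle\right)|00\rangle + a_1\frac{1}{\sqrt{2}}\left(e^{i\varphi}|0\rangle - e^{-i\varphi}|1\rangle\right)|11\rangle. \qquad (8.164)$$
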

At this stage, in this simple theoretical model, the coupling with environment is completely 
quenched (ahh, if this could be possible in reality! we would have quantum computers by now :-), and 
the source qubit is fed into one more Hadamard gate. Using Eqs. (138) again, for state D after this gate 
we get 

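$$|D\rangle = a_0\left(\cos\varphi\,|0\rangle + i\sin\varphi\,|1\rangle\right)|00\rangle + a_1\left(i\sin\varphi\,|0\rangle + \cos\varphi\,|1\rangle\right)|11\rangle. \qquad (8.165)$$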

Now the qubits are passed through the second, similar pair of CNOT gates - see Fig. 7b. Using Eq. 
(137), for the ket-vector of the resulting state E we readily get expression 

$$|E\rangle = a_0\cos\varphi\,|000\rangle + a_0\, i\sin\varphi\,|111\rangle + a_1\, i\sin\varphi\,|011\rangle + a_1\cos\varphi\,|100\rangle, \qquad (8.166a)$$


which evidently may be grouped as 

$$|E\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)\cos\varphi\,|00\rangle + \left(a_1|0\rangle + a_0|1\rangle\right) i\sin\varphi\,|11\rangle. \qquad (8.166b)$$

This is already a rather remarkable result. It shows that if we measure the ancilla qubits at stage
E, and both results correspond to states 0, we may be 100% sure that the source qubit (which is not
affected by the measurement!) is in its initial state even after the interaction with the environment. The only
result of an increase of this interaction (as quantified by the magnitude of the phase φ) is the growth of the
probability,

$$W = \sin^2\varphi, \qquad (8.167)$$

of getting the opposite result, which signals a dephasing-induced error in the source qubit. This implicit
measurement, without disturbing the source qubit, is called quantum error detection.

An even more impressive result may be achieved by adding to the circuit one more component, the
so-called Toffoli (or “CCNOT”) gate, denoted by the rightmost symbol in Fig. 7b. This 3-qubit gate is
conceptually similar to the CNOT gate discussed above, except that it flips the basis state of its target
qubit only if both basis states of its two source qubits are 1. (In the circuit shown in Fig. 7b, the former
role is played by our source qubit, while the latter role, by the two ancilla qubits.) According to its


78 For example, in the Hilbert space of the qubit, the model Hamiltonian (7.70), which was explored in Sec. 7.3, is
diagonal in the z-basis of states 0 and 1, so that the unitary transform it provides during the interval 𝒯 is also diagonal,
giving the phase shifts described by Eq. (163), with φ = (1/ħ)∫_0^𝒯 f(t) dt. Let me emphasize again that Eq. (163) is
valid only if the interaction with the environment is a pure dephasing, i.e. does not include the energy relaxation of
the qubit or its thermal activation to the higher eigenstate - see Chapter 7.

definition, the Toffoli gate has no effect on the first parentheses in Eq. (166b), but flips the source
qubit’s state in the second parentheses. The result may be factorized as follows,

$$|F\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)\left(\cos\varphi\,|00\rangle + i\sin\varphi\,|11\rangle\right), \qquad (8.168)\ \text{[Quantum error correction]}$$

showing that now the source qubit is again fully unentangled from the ancilla qubits. Moreover,
calculating the norm squared of the second operand, we get

$$\left(\cos\varphi\,\langle 00| - i\sin\varphi\,\langle 11|\right)\left(\cos\varphi\,|00\rangle + i\sin\varphi\,|11\rangle\right) = \cos^2\varphi + \sin^2\varphi = 1, \qquad (8.169)$$


so that the final state of the source qubit always, exactly coincides with its initial state. This is the
famous miracle of quantum state correction, taking place “automatically” - without any qubit
measurements, and for any random phase shift φ.
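
The whole chain of stages A through F may be followed numerically. The sketch below is a minimal state-vector simulation written for this purpose only, not a general simulator. Its assumptions are: the amplitudes are indexed by the bits (source, ancilla 1, ancilla 2), the dephasing multiplies the source states 0 and 1 by exp(iφ) and exp(−iφ) respectively, and all function names and sample values are hypothetical.

```python
import cmath
import math

def on_source(state, gate):
    """Apply a 2x2 single-qubit gate to the source qubit (the most significant bit)."""
    new = [0j] * 8
    for rest in range(4):
        lo, hi = state[rest], state[4 + rest]
        new[rest] = gate[0][0] * lo + gate[0][1] * hi
        new[4 + rest] = gate[1][0] * lo + gate[1][1] * hi
    return new

def permute(state, rule):
    """Apply a classical reversible gate given as a map of basis-state indices."""
    new = [0j] * 8
    for index, amp in enumerate(state):
        new[rule(index)] = amp
    return new

def cnot_pair(index):
    """Two CNOT gates: the source bit (value 4) flips both ancilla bits (values 2 and 1)."""
    return (index ^ 0b011) if (index & 0b100) else index

def toffoli(index):
    """Toffoli gate: flip the source bit only if both ancilla bits are 1."""
    return (index ^ 0b100) if (index & 0b011) == 0b011 else index

def run_circuit(a0, a1, phi):
    """Return the states at stages E and F for the input a0|0> + a1|1>."""
    r = 1 / math.sqrt(2)
    hadamard = [[r, r], [r, -r]]
    dephasing = [[cmath.exp(1j * phi), 0], [0, cmath.exp(-1j * phi)]]
    state = [a0, 0, 0, 0, a1, 0, 0, 0]       # the source state times |00>
    state = permute(state, cnot_pair)         # stage A
    state = on_source(state, hadamard)        # stage B
    state = on_source(state, dephasing)       # stage C
    state = on_source(state, hadamard)        # stage D
    state_e = permute(state, cnot_pair)       # stage E
    state_f = permute(state_e, toffoli)       # stage F
    return state_e, state_f

a0, a1, phi = 0.6, 0.8j, 0.4
state_e, state_f = run_circuit(a0, a1, phi)
w_error = sum(abs(state_e[i]) ** 2 for i in (0b011, 0b111))  # both ancillas read 1
print(w_error, math.sin(phi) ** 2)
print(state_f)
```

The first printed pair compares the probability of the “error” readout of the ancillas at stage E with sin²φ, while the final amplitudes are nonzero only for the basis states 000, 011, 100, and 111, in the proportions expected for the product of the initial source state and the ancilla state cos φ|00⟩ + i sin φ|11⟩.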


The circuit shown in Fig. 7b may be improved by adding Hadamard gate pairs, similar to that
used for the source qubit, to the ancilla qubits as well. If dephasing is small in the sense that the W given
by Eq. (167) is much less than 1, this modified circuit may provide a substantial error probability
reduction (to ~W²) even if the ancilla qubits are also subjected to dephasing similar to that of the source
qubit, at the same stage - i.e. between the two Hadamard gates. The perfect automatic correction of any
error (not only the inner dephasing of a qubit and its relaxation/excitation, but also the mutual dephasing
between qubits) of any used qubit needs even more parallelism. The first circuit of that kind, based on 9
parallel qubits, which is a natural generalization of the circuit discussed above, was invented in
1995 by the same P. Shor. Later, 5-qubit circuits enabling similar error correction were suggested. (The
further parallelism reduction has been proved impossible.)


However, all these results assume that the error correction circuits as such are perfect, i.e. 
completely isolated from the environment. In the real world this cannot be done. Now the key question 
is what maximum level W_max of error probability in each gate (including those in the used error
correction scheme) can be automatically corrected, thus opening a way toward large quantum computers 
producing some useful results - first of all, the factoring of large numbers - with at least 10 bits to be of 
interest for practice. To the best of my knowledge, this critical level has not yet been strictly calculated,
partly because the error correction greatly inflates the total number of gates in the system - by a
factor crudely proportional to the number N of used qubits. Various authors give broadly different
estimates: from W max ~10' to W max ~ 10'“. Whatever the critical level is, it has not been reached yet. 


This situation has motivated the search for quantum computation schemes different from that
shown in Fig. 2; the most prominent alternative is called adiabatic quantum computation. 79 In its most
actively pursued option (for which “quantum system modeling” would be a more appropriate name), the
interaction between the qubits of a system is organized so that the system’s Hamiltonian is similar to that of
some quantum system of interest. Then the qubit system, first prepared in a certain initial state with
relatively high energy, e.g., in an unentangled state described by Eq. (130), is left to evolve on its own.
Because of the unavoidable dissipation due to interaction with the environment, the system eventually relaxes to
a final unentangled state of its qubits, which is then measured. From numerous runs of such an experiment,
outcome statistics may be revealed for various temperatures of the environment. Thus, in this approach
(which is very close to the numerical modeling technique called quantum annealing), the interaction


79 Note that the qualifier “quantum” is important here, to distinguish this research direction from the option of
classical adiabatic (or “reversible”) computation - see, e.g., SM Sec. 3.3 and references therein.

with environment is allowed to play a certain role in the system evolution, though every effort is made
to reduce it, to allow qubit “quantumness” to make a substantial difference at least at the beginning of
the relaxation process.

Generally speaking, adiabatic quantum computation may be used for performing any quantum
algorithm, including number factoring. 80 Unfortunately, due to technical difficulties of the organization
and precise control of long-range interaction in multi-qubit systems, 81 the list of modeled systems is
presently limited to a few simple 1D or 2D arrays described by the so-called extended quantum Ising
(“spin-glass”) model 82

$$\hat{H} = -J\sum_{\{j,j'\}} \hat{\sigma}_z^{(j)}\hat{\sigma}_z^{(j')} - \sum_j h_j\hat{\sigma}_z^{(j)}, \qquad (8.170)$$

where the curly brackets denote the summation over pairs of close (though not necessarily closest)
neighbors. Though Hamiltonian (170) is the traditional playground of phase transition theory (see, e.g.,
SM Chapter 4), to the best of my knowledge there are not many practically valuable tasks that could be
achieved by studying the statistics of its solutions. Moreover, even for this limited task, the speed of the
best experimental adiabatic quantum “computer” with N = 108 qubits is still lower than that of a
classical, off-the-shelf semiconductor processor (with a dollar cost lower by some 6 orders of
magnitude), and no dramatic change of this comparison is predicted for realistic larger values of N. 83

There may be better prospects for another application of entangled qubit systems, namely for
telecommunication cryptography. 84 The goal here is to replace the currently dominating classical
encryption, based on the public-key RSA code mentioned above, that may be broken by factoring
very large numbers, by a quantum encryption that would be fundamentally unbreakable. The basis of
this opportunity is the combination of the measurement postulate and the no-cloning theorem: if a message is carried
by a qubit such as a single photon, it is impossible for an eavesdropper (in cryptography, traditionally
called Eve) to either measure or copy it faithfully, without also disturbing its state. However, as we
have seen from the discussion of Fig. 7a, state quasi-cloning using entangled qubits is possible, so that
the issue is far from being simple, especially if we want to use a publicly distributed quantum key, in
some sense similar to the classical public key used in the RSA encryption.

Unfortunately, I do not have time/space to discuss various options for quantum encryption, but
cannot help demonstrating how counter-intuitive they may be, using the famous example of the so-called
quantum teleportation (Fig. 8). 85 Suppose that party A (in cryptography, traditionally called Alice)
wants to send party B (Bob) the full information about the quantum state α of a qubit, unknown to either
party. Instead of sending her qubit directly to Bob, Alice asks him to send her one qubit (β) of a pair
of other qubits, prepared in a certain entangled state, for example in the singlet state (11):


80 See, e.g., the experiments on factoring of the number 143 = 13×11, using nuclear spin relaxation, by N. Xu et al.,
Phys. Rev. Lett. 108, 130501 (2012), though at the time of this writing, their results remained controversial.

81 For the same reason, the implementation is so far limited to the most scalable, Josephson-junction (flux) qubits
- see, e.g., M. Johnson et al., Nature 473, 194 (2011).

82 For its classical version, see, e.g., SM Eq. (4.23) and its discussion.

83 See S. Boixo et al., Nature Physics 10, 218 (2014) and T. Rønnow et al., arXiv:1401.2910 [quant-ph].

84 This field was pioneered in the 1970s by S. Wiesner.

85 This procedure was first suggested in 1993 by the same C. Bennett, and then repeatedly demonstrated
experimentally - see, e.g., the recent paper by L. Steffen et al., Nature 500, 319 (2013), and literature therein.
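
$$|s_{\beta\beta'}\rangle = \frac{1}{\sqrt{2}}\left(|01\rangle - |10\rangle\right). \qquad (8.171)$$

Using Eq. (125), the initial state of the whole 3-qubit system may be represented by the ket-vector

$$|\alpha\beta\beta'\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)|s_{\beta\beta'}\rangle = \frac{a_0}{\sqrt{2}}|001\rangle - \frac{a_0}{\sqrt{2}}|010\rangle + \frac{a_1}{\sqrt{2}}|101\rangle - \frac{a_1}{\sqrt{2}}|110\rangle, \qquad (8.172a)$$

which may be rewritten as a linear superposition,

$$|\alpha\beta\beta'\rangle = \frac{1}{2}|\psi_s^+\rangle\left(-a_1|0\rangle + a_0|1\rangle\right) + \frac{1}{2}|\psi_s^-\rangle\left(a_1|0\rangle + a_0|1\rangle\right) + \frac{1}{2}|\psi_e^+\rangle\left(-a_0|0\rangle + a_1|1\rangle\right) + \frac{1}{2}|\psi_e^-\rangle\left(-a_0|0\rangle - a_1|1\rangle\right), \qquad (8.172b)$$

of the following four states of the qubit pair αβ:

$$|\psi_s^\pm\rangle \equiv \frac{1}{\sqrt{2}}\left(|00\rangle \pm |11\rangle\right), \qquad |\psi_e^\pm\rangle \equiv \frac{1}{\sqrt{2}}\left(|01\rangle \pm |10\rangle\right). \qquad (8.173)$$

Fig. 8.8. Sequential stages of a quantum
teleportation procedure: (a) the initial state with
entangled qubits β and β’, (b) back transfer of
qubit β, (c) measurement of pair αβ, (d) forward
transfer of 2 classical bits with the measurement
result, and (e) the final state, with the state of
qubit β’ mirroring the initial state of qubit α.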


After having received qubit β from Bob, Alice measures which of these 4 states the pair αβ
has. This may be achieved, for example, by the measurement of one observable represented by the operator
σ̂_z^(α)σ̂_z^(β), and another one corresponding to σ̂_x^(α)σ̂_x^(β) - cf. Eq. (148). 86 The measured eigenvalue of the
former operator enables distinguishing the couples of states (173) with different values of the lower
index, while the latter measurement distinguishes the states with different upper indices.

Then Alice reports the result (that may be coded by just 2 classical bits) to Bob over a classical
channel. Since the measurement places the pair αβ definitely in the corresponding state, the remaining
Bob’s qubit β’ is now definitely in the unentangled single-qubit state that is represented by the
corresponding parentheses in Eq. (172b). Note that each of these parentheses contains both coefficients
a_0,1, i.e. the whole information that the qubit α had initially. If Bob likes, he may now
use appropriate single-qubit operations, similar to those discussed above, to move qubit β’ into the state
exactly similar to the initial state of qubit α. (This fact does not violate the no-cloning theorem (159),
because the measurement has already changed the state of α.) This is of course a “teleportation” only in


86 All four states (173) are eigenstates of both these operators, so that the measurements do not affect each other
and may be done in any order.

a very special sense of this rather ambiguous term, but a good example of the importance of qubit
entanglement’s preservation at their spatial transfer. For us, this is also a good primer for the
forthcoming discussion of the EPR paradox and Bell’s inequalities in Sec. 10.1.
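
The teleportation procedure may also be checked by a direct calculation. The Python sketch below is an illustration under explicit assumptions. The three qubits are ordered as (α, β, β’). The pair ββ’ starts in the singlet state (|01⟩ − |10⟩)/√2. The four outcomes of Alice's measurement are labeled by the (hypothetical) keys "s+", "s-", "e+", "e-", standing for (|00⟩ ± |11⟩)/√2 and (|01⟩ ± |10⟩)/√2. The table of Bob's corrections is my choice, one that restores the initial state up to an inconsequential common phase.

```python
import math

R = 1 / math.sqrt(2)
# States of the pair (alpha, beta) as amplitudes of |00>, |01>, |10>, |11>
BELL = {
    "s+": [R, 0, 0, R], "s-": [R, 0, 0, -R],
    "e+": [0, R, R, 0], "e-": [0, R, -R, 0],
}
# Bob's single-qubit corrections (2x2 matrices) for each of Alice's results
CORRECTION = {
    "s+": [[0, 1], [-1, 0]],   # sigma_x followed by sigma_z
    "s-": [[0, 1], [1, 0]],    # sigma_x
    "e+": [[1, 0], [0, -1]],   # sigma_z
    "e-": [[1, 0], [0, 1]],    # nothing to do
}

def teleport(a0, a1):
    """Return {Alice's result: (its probability, Bob's corrected state)}."""
    # |alpha> (|01> - |10>)/sqrt(2), with index = 4*alpha + 2*beta + beta'
    state = [0, a0 * R, -a0 * R, 0, 0, a1 * R, -a1 * R, 0]
    results = {}
    for name, bell in BELL.items():
        # project the pair (alpha, beta) on this state; qubit beta' is left over
        bob = [sum(bell[pair].conjugate() * state[2 * pair + b] for pair in range(4))
               for b in (0, 1)]
        prob = sum(abs(c) ** 2 for c in bob)
        norm = math.sqrt(prob)
        m = CORRECTION[name]
        fixed = [(m[i][0] * bob[0] + m[i][1] * bob[1]) / norm for i in (0, 1)]
        results[name] = (prob, fixed)
    return results

a0, a1 = 0.6, 0.8j
for name, (prob, bob) in teleport(a0, a1).items():
    fidelity = abs(a0.conjugate() * bob[0] + a1.conjugate() * bob[1]) ** 2
    print(name, round(prob, 3), round(fidelity, 6))
```

Each of the four results comes out with the same probability, and the fidelity of Bob's corrected state to the initial state of qubit α is unity in each case, while the two classical bits naming the result carry no information about a_0 and a_1.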

Returning for a minute to practical quantum cryptography, since its two most common quantum
key distribution protocols 87 require just a few simple quantum gates, whose experimental
implementation is not a large obstacle, the main focus of the current effort is on decreasing single-photon
dephasing in long optical fiber waveguides, 88 and hence increasing the maximum distance of
quantum channels with sufficiently high qubit transfer fidelity. The recent progress was impressive, with
two demonstrated lines (using either protocol) longer than 100 km, 89 and active plans for 560 km and
700 km landlines and several satellite-based systems. Let me hope that if not the author, then the reader
of these notes will see this technology accepted for practical secure telecommunications.


8.6. Exercise problems 

8.1. N electrons are placed in a 3D, spherically-symmetric quadratic potential U(r) = mω_0²r²/2.
Neglecting the direct interaction of the electrons, find the ground-state energy of the system.

8.2. N >> 1 indistinguishable, non-interacting quantum particles are placed in a hard-wall,
rectangular box with sides a_x, a_y, and a_z. Calculate the ground-state energy of the system, and the
average forces it exerts on each face of the box. Can we characterize the forces by a certain pressure?

Hint : Consider separately the cases of bosons and fermions. 

8.3. Prove that the singlet state, and each triplet state of a system of two indistinguishable spin-½
particles, are eigenstates of the operator of the scalar product Ŝ_1·Ŝ_2 of the spin vectors, and calculate the
corresponding eigenvalues. Compare the results with the scalar product of two classical vectors of
magnitude ħ/2 each, being either parallel or antiparallel.

8.4. The interaction of two indistinguishable spin-½ particles (that are otherwise free) has the
form

«... =t/(r) + /(r)VS 2 , 

where r ≡ |r_1 − r_2| is the distance between the particles. Reduce the problem to two independent wave
mechanical problems.

8.5. Two similar spin-½ particles, with the gyromagnetic ratio γ, localized at two points
separated by distance a, interact via the field of their magnetic dipole moments. Calculate the
eigenstates and eigenvalues of the system.


87 BB84 suggested in 1984 by C. Bennett and G. Brassard, and EPRBE suggested in 1991 by A. Ekert. For
details, see, e.g., either Sec. 12.6 in Nielsen and Chuang, or the review by N. Gisin et al., Rev. Mod. Phys. 74, 145
(2002).

88 For their discussion see, e.g., EM Sec. 7.8.

89 See P. Hiskett et al., New J. Phys. 8, 193 (2006), and R. Ursin et al., Nature Physics 3, 481 (2007).

8.6. In the simple case of just two similar spin-interacting particles, distinguishable by their
spatial location, the famous Heisenberg model of ferromagnetism 90 is reduced to the following
Hamiltonian:

$$\hat{H} = -J\,\hat{\mathbf{S}}_1\cdot\hat{\mathbf{S}}_2 - \gamma\,\boldsymbol{\mathcal{B}}\cdot\left(\hat{\mathbf{S}}_1 + \hat{\mathbf{S}}_2\right),$$

where J is the spin interaction constant, γ is the gyromagnetic ratio of each particle, and 𝓑 is the
external magnetic field. Find the stationary states and eigenenergies of this system for spin-½ particles.

8.7. Two distinguishable particles, both with spin ½, but different gyromagnetic ratios γ_1 and γ_2,
are placed into an external magnetic field 𝓑. In addition, their spins interact as

$$\hat{H}_{\rm int} = -J\,\hat{\mathbf{S}}_1\cdot\hat{\mathbf{S}}_2.$$

Find the eigenstates and eigenenergies of the system. 91

8.8. A system of 3 similar but distinguishable spin-½ particles is described by the following
Hamiltonian:

$$\hat{H} = -J\left(\hat{\mathbf{S}}_1\cdot\hat{\mathbf{S}}_2 + \hat{\mathbf{S}}_2\cdot\hat{\mathbf{S}}_3 + \hat{\mathbf{S}}_3\cdot\hat{\mathbf{S}}_1\right),$$

where J is the spin interaction constant. Find the stationary states and eigenenergies of this system.

8.9. For a system of three distinguishable spins-½, find the joint eigenstates (and the
corresponding eigenvalues) of the operators Ŝ_z and Ŝ², where

$$\hat{\mathbf{S}} \equiv \hat{\mathbf{S}}_1 + \hat{\mathbf{S}}_2 + \hat{\mathbf{S}}_3$$

is the vector operator of the total spin of the system. Do the corresponding quantum numbers s and m_s
obey Eqs. (5.197)?

8.10. Prove that Eq. (8.32) of the lecture notes indeed yields E_g^(1) = (5/4)E_H.

8.11. For a dilute gas of helium atoms in their ground state, with n atoms per unit volume,
calculate its:

(i) electric susceptibility χ_e, and

(ii) magnetic susceptibility χ_m,

and compare the results.

8.12. Represent the operators of the total kinetic energy and the total orbital angular momentum
of a system of two particles, with masses m_1 and m_2, as combinations of terms describing their
center-of-mass motion and relative motion.

8.13. Two particles, with masses m_1 and m_2, interact as described by the 3D potential


90 For more discussion of this and other models of ferromagnetism and antiferromagnetism see SM Chapter 4. 

91 For similar particles (in particular, with γ_1 = γ_2) the problem is reduced to the previous one.

$$U(\mathbf{r}_1, \mathbf{r}_2) = \frac{\kappa}{2}\left(\mathbf{r}_1 - \mathbf{r}_2\right)^2,$$

but otherwise are free to move. Calculate the energy spectrum and the degeneracy of each energy level
of the system for the cases when the particles are:

(i) distinguishable, and

(ii) indistinguishable spin-½ fermions (such as electrons).


8.14. Two particles with similar masses m and charges q are free to move along a round, plane
ring of radius R. In the limit of their strong Coulomb interaction, find the lowest eigenenergies of the 
system, and sketch the system of its energy levels. 

8.15. Two similar 1D, spin-½ particles are attracting each other at contact:

$$U(x_1, x_2) = -\mathcal{W}\,\delta(x_1 - x_2), \qquad \text{with } \mathcal{W} > 0,$$

but are otherwise free to move. Find the energy and the wavefunction of the ground state of the system.
Hint: Mind the possibility of various spin states of the particles.


8.16. Two indistinguishable, 1D, spin-½ particles in a triplet spin state are attracting each other
at limited distance:

$$U(x_1, x_2) = \begin{cases} -U_0, & \text{for } |x_1 - x_2| < a/2, \\ 0, & \text{otherwise}, \end{cases} \qquad \text{with } U_0 > 0,$$

but are otherwise free to move. How large should a be for the system to have at least one localized
eigenstate? Relate the result to the solution of the previous problem.


8.17. Two indistinguishable spin-½ particles are confined to move around a circle of radius R,
and interact only at a very short distance l = Rφ ≡ R(φ_1 − φ_2) between them, so that the interaction
potential U may be well approximated with a delta-function of φ. Calculate the ground state of the system
and its energy for the following two cases:

(i) “orbital” (spin-independent) interaction: U = 𝒲δ(φ),

(ii) spin-spin interaction: U = −𝒲 Ŝ_1·Ŝ_2 δ(φ),

both with constant 𝒲 > 0. Analyze the trends of your results in the limits 𝒲 → 0 and 𝒲 → ∞.


8.18. The low-energy spectrum of many diatomic molecules may be well described by modeling the
molecule as a system of two spinless particles connected with a light and elastic, but very stiff spring.
Calculate the spectrum in this approximation.


8.19. Two particles of mass M, separated by two much lighter particles,
of mass m << M, are placed on a ring of radius R - see Fig. on the right. The
particles strongly repel at contact, but otherwise each of them is free to move
along the ring. Calculate the lower part of the energy spectrum of the system.

8.20. Use the perturbation theory to calculate the contribution to the hyperfine splitting of the
ground energy of the hydrogen atom, due to the interaction between the spins of its nucleus (proton) and
electron.

Hint: The proton’s magnetic moment operator is described by Eq. (4.116), with a positive
gyromagnetic factor γ_p = g_p e/2m_p ≈ 2.675×10⁸ s⁻¹T⁻¹, whose magnitude is much smaller than that of the
electron (|γ_e| ≈ 1.761×10¹¹ s⁻¹T⁻¹), due to a different g-factor, g_p ≈ 5.586, 92 and of course a much higher
mass, m_p ≈ 1.673×10⁻²⁷ kg.

8.21. Discuss the factors ±1/√2 that participate in Eqs. (19) and (21), in terms of the Clebsch-Gordan
coefficients discussed in Sec. 5.7.

8.22 . Compose the simplest model Hamiltonians of the following systems, in terms of the second 
quantization formalism: 

(i) a system of two weakly coupled quantum wells, taking into account pair on-site interactions 
(additional energy J per each pair of particles in the same quantum well), and 

(ii) same for the motion in a periodic 1D potential, in the tight-binding limit. 

8.23 . For each of the Hamiltonians composed in the previous problem, derive the Heisenberg 
equations of motion for particle creation operators, for (i) bosons, and (ii) fermions. 

8.24 . Express the ket-vectors of all possible Dirac states for the system of 3 indistinguishable 

(i) bosons, and 

(ii) fermions, 

via those of their single-particle states. 

8.25 . Explain why the Hartree-Fock approximation (118), applied to the $^4$He atom, gives 
“correct” 93 expression (31) for the ground singlet state, and correct Eqs. (44)-(45) (with the minus sign 
in the former relation) for the excited triplet state, but cannot describe result (44), with the plus sign, for 
the excited singlet state. 

8.26 . Find a time-independent Hamiltonian that may cause the qubit evolution described by Eq. 
(147). Discuss the result and its relation to the time-dependent Hamiltonian (6.86). 


92 The anomalously large value of its g-factor may be qualitatively understood as a result of the three-quark 
structure of this particle. (The exact quantitative calculation of g p still remains a challenge for quantum 
chromodynamics.) 

93 Correct in the sense of the 1st order of the perturbation theory. 

Chapter 9. Introduction to Relativistic Quantum Mechanics 

This chapter gives a brief introduction to relativistic quantum mechanics. It starts with a discussion of 
the basic elements of the quantum theory of electromagnetic field ( quantum electrodynamics, QED), 
including the quantization scheme, photon statistics, radiative atomic transitions, the spontaneous and 
stimulated radiation, and the so-called cavity QED. Then I will briefly review the relativistic quantum 
theory of particles with nonvanishing rest mass, notably Dirac’s theory of spin-½ particles, and mark the 
point of entry into the most complete relativistic quantum theory - the quantum field theory, QFT - 
which is beyond the scope of these notes . 1 


9.1. Electromagnetic field quantization 

Classical mechanics tells us 2 that the relativistic relation between momentum $p$ and energy $E$ of 
a free particle with rest mass $m$ may be simplified in two limits, nonrelativistic and ultrarelativistic: 

Free particle’s relativistic energy:

$$E = \left[(pc)^2 + (mc^2)^2\right]^{1/2} \to \begin{cases} mc^2 + p^2/2m, & \text{for } p \ll mc, \\ pc, & \text{for } p \gg mc. \end{cases}$$ (9.1)

In both limits, the transfer from classical to quantum mechanics is easier than in the arbitrary case. Since 
all the previous part of this course was committed to the first, nonrelativistic limit, I will now jump to a 
brief discussion of the ultrarelativistic limit $p \gg mc$, for a particular but very important system - the 
electromagnetic field. Since the excitations of this field, called photons, are currently believed to have 
zero rest mass $m$, 3 the ultrarelativistic limit is valid for any photon energy $E$, and the quantization 
scheme is rather straightforward. 
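
The two limits of Eq. (1) are easy to check numerically. The following is only a minimal sketch, not part of the original derivation: the function names are illustrative, and the electron mass is a rounded value used just to set the scale. It compares the exact energy with both approximations for $p/mc$ = 0.001, 1, and 1000.

```python
import math

C = 299_792_458.0   # free-space speed of light, m/s
M_E = 9.109e-31     # electron rest mass, kg (rounded; used only as an example)

def energy_exact(p, m):
    """Exact free-particle energy E = [(pc)^2 + (mc^2)^2]^(1/2), Eq. (9.1)."""
    return math.hypot(p * C, m * C**2)

def energy_nonrelativistic(p, m):
    """Nonrelativistic limit of Eq. (9.1): rest energy plus p^2/2m."""
    return m * C**2 + p**2 / (2.0 * m)

def energy_ultrarelativistic(p):
    """Ultrarelativistic limit of Eq. (9.1): E = pc."""
    return p * C

for ratio in (1e-3, 1.0, 1e3):          # ratio = p / (m c)
    p = ratio * M_E * C
    exact = energy_exact(p, M_E)
    print(ratio,
          energy_nonrelativistic(p, M_E) / exact,
          energy_ultrarelativistic(p) / exact)
```

At $p \ll mc$ the first printed ratio is very close to 1, at $p \gg mc$ the second one is, and at $p \sim mc$ neither approximation is accurate.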


As usual, the quantization has to be based on the classical theory of the system, in this case the 
Maxwell equations. As the simplest case, let us consider the electromagnetic field in a free-space volume 
limited by ideal walls that reflect incident waves perfectly. 4 Inside the volume, the Maxwell equations 
may be reduced to a simple wave equation 5 for the electric field 

$$\nabla^2 \boldsymbol{\mathscr{E}} - \frac{1}{c^2}\frac{\partial^2 \boldsymbol{\mathscr{E}}}{\partial t^2} = 0,$$ (9.2)

and an absolutely similar equation for the magnetic field $\boldsymbol{\mathscr{B}}$. We may look for the general solution of Eq. (2) 
in the variable-separating form 


1 Note that some material of this chapter is frequently taught as a part of the QFT. I will focus on a few most 
important results that may be obtained without starting heavy QFT engines. 

2 See, e.g., EM Chapter 9. 

3 By now this fact has been verified experimentally with an accuracy of at least ~$10^{-22}\, m_e$ - see S. Eidelman et al., 
Phys. Lett. B 592, 1 (2004). 

4 In the case of finite energy absorption in the walls, or in the wave propagation media (say, described by complex 
constants $\varepsilon$ and $\mu$), the system would not be energy-conserving (Hamiltonian), i.e. would interact with the 
dissipative environment. Specific cases of such interaction will be considered in Sections 2 and 3 below. 

5 See, e.g., EM Eq. (7.3), for the particular case $\varepsilon = \varepsilon_0$, $\mu = \mu_0$, $v^2 \equiv 1/\varepsilon\mu = 1/\varepsilon_0\mu_0 \equiv c^2$. 

$$\boldsymbol{\mathscr{E}}(\mathbf{r}, t) = \sum_j p_j(t)\,\mathbf{e}_j(\mathbf{r}).$$ (9.3)

Physically, each term of this sum is a standing wave whose spatial distribution and polarization 
(“mode”) is described by the vector function $\mathbf{e}_j(\mathbf{r})$, and the temporal dynamics, by the function $p_j(t)$. Plugging an 
arbitrary term of this sum into Eq. (2), and separating variables exactly as we did, e.g., for the 
Schrödinger equation in Sec. 1.4, we get 

$$\frac{\nabla^2 \mathbf{e}_j}{\mathbf{e}_j} = \frac{1}{c^2}\frac{\ddot{p}_j}{p_j} = \text{const} \equiv -k_j^2,$$ (9.4)

so that the spatial distribution of the mode satisfies the 3D Helmholtz equation: 

Equation for spatial distribution:

$$\nabla^2 \mathbf{e}_j + k_j^2\,\mathbf{e}_j = 0.$$ (9.5)

The set of solutions of this equation, with appropriate boundary conditions, determines the set of 
functions $\mathbf{e}_j$, and simultaneously the spectrum of wave number moduli $k_j$. The latter values determine 
the mode eigenfrequencies, following from Eq. (4): 

$$\ddot{p}_j + \omega_j^2 p_j = 0, \quad \text{with } \omega_j \equiv k_j c.$$ (9.6)


There is a big philosophical difference between the approaches to equations (5) and (6), despite 
their single origin (4). The first (Helmholtz) equation may be rather difficult to solve in realistic 
geometries, 6 but it remains intact in quantum theory, with the scalar components of vector functions 
$\mathbf{e}_j(\mathbf{r})$ still treated (at each point $\mathbf{r}$) as c-numbers. In contrast, Eq. (6) is readily solvable (giving sinusoidal 
oscillations with frequency $\omega_j$), but this is exactly where we can make a transfer to quantum mechanics, 
because we already know how to quantize a mechanical 1D harmonic oscillator that obeys, in classics, 
the same equation. 


As usual, we need to start with the appropriate Hamiltonian corresponding to the classical 
Hamiltonian function $H$ of the proper set of generalized coordinates and momenta. The electromagnetic 
field’s Hamiltonian function (which in this case coincides with the field’s energy) is 7 

$$H = \int d^3r \left( \frac{\varepsilon_0 \mathscr{E}^2}{2} + \frac{\mathscr{B}^2}{2\mu_0} \right).$$ (9.7)

Let us represent the magnetic field in a form similar to Eq. (3), 

$$\boldsymbol{\mathscr{B}}(\mathbf{r}, t) = -\sum_j \omega_j q_j(t)\,\mathbf{b}_j(\mathbf{r}).$$ (9.8)

Since, according to the Maxwell equations, in our case the magnetic field satisfies an equation similar 
to Eq. (2), the time-dependent amplitude $q_j$ of each of its modes obeys an equation similar to Eq. (6), i.e. 
also changes in time sinusoidally, with the same frequency $\omega_j$. Plugging Eqs. (3) and (8) into Eq. (7), we 
may recast it as 


6 See, e.g., various problems discussed in EM Chapter 7, especially in Sec. 7.9. 

7 See, e.g., EM Sec. 9.8, in particular, Eq. (9.225). I am using SI units, with $\varepsilon_0\mu_0 \equiv c^{-2}$; in the Gaussian units, 
coefficients $\varepsilon_0$ and $\mu_0$ disappear, but there is an additional common factor $1/4\pi$ in the equation for energy. If we 
modify the normalization conditions accordingly, all the subsequent results look similar in any system of units. 

$$H = \sum_j \left[ \frac{p_j^2}{2}\int \varepsilon_0\, e_j^2(\mathbf{r})\, d^3r + \frac{\omega_j^2 q_j^2}{2}\int \frac{1}{\mu_0}\, b_j^2(\mathbf{r})\, d^3r \right].$$ (9.9)

Since the distribution of constant factors between two multiplication operands in each term of Eq. (3) is 
arbitrary, we may fix it by requiring the first integral in Eq. (9) to equal 1. It is straightforward to check 
that according to the Maxwell equations, which give a specific relation between vectors $\boldsymbol{\mathscr{E}}$ and $\boldsymbol{\mathscr{B}}$, 8 this 
normalization makes the second integral in Eq. (9) equal 1 as well, and Eq. (9) becomes 

$$H = \sum_j H_j, \quad H_j = \frac{p_j^2}{2} + \frac{\omega_j^2 q_j^2}{2}.$$ (9.10)

Now we can carry out the standard quantization procedure, namely declare $H_j$, $p_j$, and $q_j$ the 
quantum-mechanical operators related exactly as in Eq. (10), 

Electromagnetic mode’s Hamiltonian:

$$\hat{H}_j = \frac{\hat{p}_j^2}{2} + \frac{\omega_j^2 \hat{q}_j^2}{2}.$$ (9.11)

We see that this Hamiltonian coincides with that of a 1D harmonic oscillator with the mass $m_j$ formally 
equal to 1, 9 and the eigenfrequency equal to $\omega_j$. Now, in order to plug Eq. (11) into Eq. (4.199) for the 
time evolution of the Heisenberg-picture operators $\hat{p}_j$ and $\hat{q}_j$, we need to know the commutation relation 
between these operators. For that, returning to the classical case, let us calculate the Poisson bracket 
(4.204) for “functions” $A = q_{j'}$ and $B = p_{j''}$: 

$$\{q_{j'}, p_{j''}\}_{\rm P} \equiv \sum_j \left( \frac{\partial q_{j'}}{\partial p_j}\frac{\partial p_{j''}}{\partial q_j} - \frac{\partial q_{j'}}{\partial q_j}\frac{\partial p_{j''}}{\partial p_j} \right).$$ (9.12a)

Since in the classical Hamiltonian mechanics, all generalized coordinates $q_j$ and momenta $p_j$ have to be 
considered independent arguments of $H$, only one term (with $j = j' = j''$) in only one sum (12) (with $j' = j''$) 
gives a nonvanishing value ($-1$), so that 

$$\{q_{j'}, p_{j''}\}_{\rm P} = -\delta_{j'j''}.$$ (9.12b)

Hence, according to the general quantization rule (4.205), the commutation relation of the operators 
corresponding to $q_{j'}$ and $p_{j''}$ is 

$$[\hat{q}_{j'}, \hat{p}_{j''}] = i\hbar\,\delta_{j'j''},$$ (9.13)

i.e. is exactly the same as for the usual Cartesian components of the radius-vector and momentum of a 
mechanical particle. 

As the reader already knows, Eqs. (11) and (13) open for us several alternative ways to proceed: 


8 See, e.g., EM Eq. (7.6). 

9 With different normalizations of functions $\mathbf{e}_j(\mathbf{r})$ and $\mathbf{b}_j(\mathbf{r})$, we could readily arrange any value of $m_j$, and the 
choice corresponding to $m_j = 1$ is the best one just for the notation simplicity. Note also that I am using notation $q_j$ 
instead of $x_j$ for the generalized coordinate of the field oscillator, in order to emphasize the difference between the 
former variable, defined by Eq. (8), and one of the Cartesian coordinates, i.e. one of arguments of c-number 
functions $\mathbf{e}$ and $\mathbf{b}$. 

(i) Use the Schrödinger-picture wave mechanics based on wavefunctions $\Psi_j(q_j, t)$. As we know 
from Sec. 2.10, this way is inconvenient for most tasks, because eigenfunctions of the harmonic 
oscillator are rather clumsy. 

(ii) A substantially better way is to write the equations of time evolution of the Heisenberg-picture 
operators $\hat{q}_j(t)$ and $\hat{p}_j(t)$. 

(iii) An even more convenient approach is to use equations similar to Eqs. (5.99) to decompose 
operators $\hat{q}_j(t)$ and $\hat{p}_j(t)$ into the creation-annihilation operators $\hat{a}_j^\dagger$ and $\hat{a}_j$, and work with these 
operators using either the Schrödinger or the Heisenberg picture, depending on the problem. 

I will mostly use the last route. Replacing $m$ with $m_j \equiv 1$, and $\omega_0$ with $\omega_j$, the last forms of Eqs. 
(5.98) become 

$$\hat{a}_j = \left(\frac{\omega_j}{2\hbar}\right)^{1/2}\left(\hat{q}_j + i\frac{\hat{p}_j}{\omega_j}\right), \quad \hat{a}_j^\dagger = \left(\frac{\omega_j}{2\hbar}\right)^{1/2}\left(\hat{q}_j - i\frac{\hat{p}_j}{\omega_j}\right),$$ (9.14)

and due to Eq. (13), the creation-annihilation operators obey the commutation relation similar to Eq. (5.101), 

$$[\hat{a}_j, \hat{a}_{j'}^\dagger] = \hat{I}\,\delta_{jj'},$$ (9.15)


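so that, according to Eqs. (3) and (8), the quantum-mechanical operators corresponding to the electric 
and magnetic fields are 

Electromagnetic fields’ operators:

$$\hat{\boldsymbol{\mathscr{E}}}(\mathbf{r}, t) = i\sum_j \left(\frac{\hbar\omega_j}{2}\right)^{1/2} \mathbf{e}_j(\mathbf{r})\left(\hat{a}_j^\dagger - \hat{a}_j\right),$$ (9.16a)

$$\hat{\boldsymbol{\mathscr{B}}}(\mathbf{r}, t) = -\sum_j \left(\frac{\hbar\omega_j}{2}\right)^{1/2} \mathbf{b}_j(\mathbf{r})\left(\hat{a}_j^\dagger + \hat{a}_j\right),$$ (9.16b)

and Eq. (11) for the $j$-th mode’s Hamiltonian becomes 

$$\hat{H}_j = \hbar\omega_j\left(\hat{a}_j^\dagger\hat{a}_j + \frac{1}{2}\hat{I}\right) = \hbar\omega_j\left(\hat{n}_j + \frac{1}{2}\hat{I}\right), \quad \text{with } \hat{n}_j \equiv \hat{a}_j^\dagger\hat{a}_j,$$ (9.17)

absolutely similar to Eq. (5.505) for a mechanical oscillator. 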
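
The operator algebra above may be illustrated with explicit matrices. The sketch below is only an illustration under stated assumptions: it uses the standard oscillator matrix elements $\langle n-1|\hat{a}|n\rangle = n^{1/2}$ in a Fock basis truncated to `DIM` states (the truncation and all names are choices of this sketch, not of the theory), and works in units of $\hbar\omega_j$.

```python
import math

def annihilation_matrix(dim):
    """Matrix of a in the truncated Fock basis |0>, ..., |dim-1>,
    using the standard elements <n-1| a |n> = n^(1/2)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def transpose(m):
    """Hermitian conjugate of a real matrix."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Plain product of two square matrices given as lists of lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

DIM = 6  # truncation size: an assumption of this sketch
a = annihilation_matrix(DIM)
a_dag = transpose(a)

number = matmul(a_dag, a)      # number operator n = a†a
a_a_dag = matmul(a, a_dag)
commutator = [[a_a_dag[i][j] - number[i][j] for j in range(DIM)]
              for i in range(DIM)]

# Diagonal of the Hamiltonian (9.17) in units of hbar*omega_j: n + 1/2.
energies = [number[n][n] + 0.5 for n in range(DIM)]
print(energies)
print([commutator[n][n] for n in range(DIM)])
```

The diagonal of the commutator equals 1 everywhere except in the last basis state, where the value $-(\text{DIM} - 1)$ is purely an artifact of the truncation; in the full (infinite) Fock space Eq. (15) holds exactly.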

Now comes a very important conceptual step. From Sec. 5.4 we know that the eigenstates (Fock 
states $n_j$) of Hamiltonian (17) have energies 

Electromagnetic mode’s eigenenergies:

$$E_j = \hbar\omega_j\left(n_j + \frac{1}{2}\right),$$ (9.18)

and, according to Eq. (5.115), operators $\hat{a}_j^\dagger$ and $\hat{a}_j$ act on the eigenkets of these states as 

$$\hat{a}_j |n_j\rangle = (n_j)^{1/2}\,|n_j - 1\rangle, \quad \hat{a}_j^\dagger |n_j\rangle = (n_j + 1)^{1/2}\,|n_j + 1\rangle,$$ (9.19)

regardless of the quantum states of other modes (frequently called field oscillators). These rules 
coincide with definitions (8.56) and (8.60) of bosonic creation-annihilation operators, and hence their 
action may be considered as the creation/annihilation of certain bosons. Such a “particle” (actually, an 
excitation of an electromagnetic field oscillator) is exactly what is, strictly speaking, called a photon. 
Note immediately that according to Eq. (16), such an excitation does not change the spatial distribution 
of the $j$-th mode of the field. So, such a “global” photon is an excitation created simultaneously at all 
points of the field confinement region. 

If this picture is too contrary to the intuitive image of a particle, please recall that we had a 
similar situation in Chapter 2 with eigenstates of the nonrelativistic Schrödinger equation: they 
represented a standing de Broglie wave existing simultaneously at all points of the particle confinement 
region. The (partial :-) reconciliation with the classical picture of a moving particle might be obtained by 
using the linear superposition principle to assemble a quasi-localized wave packet of sinusoidal waves, 
with close wave numbers. Very similarly, we may form a quasi-localized wave packet using a linear 
superposition of the “global” photons with close values of $k_j$ (and hence $\omega_j$). An additional simplification 
here is that the dispersion relation for electromagnetic waves is linear: 

$$\frac{\partial\omega_j}{\partial k_j} = c = \text{const}, \quad \text{i.e. } \frac{\partial^2\omega_j}{\partial k_j^2} = 0,$$ (9.20)

so that, according to Eq. (2.39a), the electromagnetic wave packets (localized photons) do not spread out 
during their propagation. 

The next important conceptual issue is that of the ground-state energy. Equation (18) implies that 
the total ground-state (i.e., the lowest) energy of the field is 

Ground-state energy of the field:

$$E_g = \sum_j \frac{\hbar\omega_j}{2}.$$ (9.21)

This sum diverges at high frequencies for any realistic model of the field-confining volume 
- either infinite or not. Any attempt to dismiss this paradox by declaring the zero-point energy 
unobservable and hence non-existing fails due to several experimental facts. 

First of all, the ground-state “fluctuations” (sometimes called “quantum noise”) can be directly 
observed - see Sec. 7.5 and in particular the literature cited therein. Second, there is the Casimir 
effect. 10 The simplest manifestation of the effect involves two parallel plates separated by a vacuum gap 
of thickness $d \ll A^{1/2}$, where $A$ is the plate area (Fig. 1). Rather counter-intuitively, the plates attract 
each other with a force proportional to area $A$, and rapidly increasing with the decrease of gap $d$. 


Fig. 9.1. Generic geometry of the Casimir effect manifestation. 


10 It was predicted in 1948 by H. Casimir and D. Polder, and confirmed semi-quantitatively in experiments by M. 
Sparnaay, Nature 180, 334 (1957) and others. A decisive error bar reduction (to about 5%), providing a 
quantitative confirmation of the Casimir formula (23), was achieved by S. Lamoreaux, Phys. Rev. Lett. 78, 5 
(1997) and U. Mohideen and A. Roy, Phys. Rev. Lett. 81, 4549 (1998). 


Chapter 9 


Page 5 of 36 


Essential Graduate Physics 


QM: Quantum Mechanics 


The effect’s explanation is that the energy of each electromagnetic field mode, including the 
ground-state energy, is intimately related with the average pressure, 

$$\langle \mathcal{P}_j \rangle = -\frac{\partial E_j}{\partial V},$$ (9.22)


exerted by the field on the walls constraining it to volume $V$. While its pressure on the external surfaces 
of the plates is due to sum (21) over all free-space modes, with arbitrary values of $k_z$ (the $z$-component 
of the wave vector $\mathbf{k}_j$), between the plates the spectrum of $k_z$ is limited to multiples of $\pi/d$, so that the 
pressure on the internal surfaces is lower. The net pressure may be found as the sum of contributions 
(22) from all “missing” low-frequency modes in the gap. The calculations are rather simple if the plates 
are made of an ideal conductor (which provides boundary conditions $\mathscr{E}_\tau = 0$ and $\mathscr{B}_n = 0$ on the plate 
surfaces), and the result is 11 

Casimir effect:

$$\langle \mathcal{P} \rangle = \sum_j \langle \mathcal{P}_j \rangle = -\frac{\pi^2 \hbar c}{240\, d^4}.$$ (9.23)

Note that for this summation, the divergence of Eq. (21) at high frequencies is 
not important, because it participates in the forces exerted on all surfaces of each plate, and hence 
cancels out from the net pressure. In this way, the Casimir effect not only gives a confirmation of Eq. 
(21), but also teaches us an important lesson on how to deal with the divergence of this sum at $\omega_j \to \infty$: just 
get accustomed to the idea that the divergence exists, and ignore the fact while you can. However, for 
more complex tasks of quantum electrodynamics (and the quantum theory of any other field) this approach 
becomes impossible, and then more complex renormalization techniques become necessary. For their 
study, I have to refer the reader to a quantum field theory course - see the literature cited at the end of 
this chapter. 
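
To get a feeling for the magnitude of the effect, Eq. (23) may be evaluated numerically. This is only a rough sketch for ideal conducting plates, with CODATA-style constants typed in by hand; the function name and the sign convention (negative pressure meaning attraction) are assumptions of this example.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 299_792_458.0        # free-space speed of light, m/s

def casimir_pressure(d):
    """Casimir pressure (Pa) between ideal parallel plates separated by
    gap d (m): -pi^2 * hbar * c / (240 * d^4). Negative sign: attraction."""
    return -math.pi**2 * HBAR * C / (240.0 * d**4)

for gap in (1e-6, 1e-7):   # 1 micron and 100 nm
    print(gap, casimir_pressure(gap))
```

For $d$ = 1 μm this gives a pressure magnitude of about 1.3 mPa, and a tenfold reduction of the gap increases it by four orders of magnitude, which is the rapid growth at small $d$ mentioned above. (As footnote 11 notes, for realistic metals the ideal-conductor formula becomes inaccurate at such small gaps.)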


9.2. Photon statistics 

As a matter of principle, the Casimir effect may be used to measure not only the free-space 
electromagnetic field, but also that arriving from local sources - lasers, etc. However, usually this is 
done by simpler detectors in which the absorption of a photon by a single atom leads to its ionization. 
This ionization, i.e. emission of a free electron, triggers a chain reaction (i.e., an electric discharge in a 
Geiger-type counter) that may readily be registered by appropriate electronic circuitry. In order to 
discuss the statistics of such photon counts, it is sufficient to consider the field interaction with just one, 


11 For realistic metals, the reduction of $d$ below ~1 μm causes significant deviations from this simple model, and 
hence from Eq. (23). The reason is that at the important frequencies $\omega \sim c/d$, the depth of field penetration into the 
metal (see, e.g., EM Secs. 2.1 and 6.2) becomes comparable with $d$, and a theory of the Casimir effect has to 
involve a certain model of field penetration. (It is curious that in-depth analyses of this problem, pioneered in 
1956 by E. Lifshitz, have revealed a deep relation between the Casimir effect and the long-range London 
dispersion forces which were the subject of Problems 3.7, 5.10 and 6.8 - for a review see, e.g., either I. 
Dzyaloshinskii et al., Sov. Phys. Uspekhi 4, 153 (1961), or K. Milton, The Casimir Effect, World Scientific, 
2001.) Recent experiments in the 100 nm - 2 μm range of distances $d$, with accuracy better than 1%, even made it 
possible to distinguish between alternative approximate models of field penetration - see D. Garcia-Sanchez 
et al., Phys. Rev. Lett. 109, 027202 (2012). 

“trigger” atom. The atom’s size $a$ is typically much smaller than the radiation wavelength $\lambda_j = 2\pi/k_j$, so 
that their interaction is adequately described in the electric dipole approximation, 

$$\hat{H}_{\rm int} = -\hat{\boldsymbol{\mathscr{E}}} \cdot \hat{\mathbf{d}},$$ (9.24)

where $\hat{\mathbf{d}}$ is the dipole moment’s operator. 12 In Sec. 6.5 we have already developed an approach suitable 
for the analysis of this problem, based on the Golden Rule - see Fig. 6.14 and Eq. (6.152). 13 In our 
current case, we may associate system b with the “trigger atom” (whose ionized states form a continuum 
spectrum), and hence operator $\hat{\mathbf{d}}$ in Eq. (24) with operand B in Eq. (6.148), while the electromagnetic 
field is represented by system a, and its electric field operator $\hat{\boldsymbol{\mathscr{E}}}$ is associated with operand A in that 
relation. Let us assume, for simplicity, that our field consists of only one mode $\mathbf{e}_j(\mathbf{r})$. 14 Then we can 
keep only one term in Eq. (16a), and drop index $j$, so that Eq. (6.152), for the transition from a certain 
initial state ini to a final state fin, may be rewritten as 


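$$\Gamma = \frac{2\pi}{\hbar}\left|\langle fin|\,\hat{\boldsymbol{\mathscr{E}}}(\mathbf{r}, t)\cdot\hat{\mathbf{d}}(t)\,|ini\rangle\right|^2 \rho_{fin} = \frac{2\pi}{\hbar}\,\frac{\hbar\omega}{2}\left|\langle fin|\left[\hat{a}^\dagger(t) - \hat{a}(t)\right] e(\mathbf{r})\,|ini\rangle\right|^2 \left|\langle fin|\,\hat{\mathbf{d}}(t)\cdot\mathbf{n}_e\,|ini\rangle\right|^2 \rho_{fin},$$ (9.25)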


where $e(\mathbf{r})$ is the local magnitude of vector $\mathbf{e}(\mathbf{r})$, and $\mathbf{n}_e \equiv \mathbf{e}(\mathbf{r})/e(\mathbf{r})$ is its local direction. 15 As a reminder, 
in the Heisenberg picture of quantum mechanics, the initial and final states are time-independent, while 
the creation-annihilation operators are functions of time. In this Golden Rule formula, as in any 
perturbation result, this time dependence has to be calculated ignoring the perturbation - in this case the 
field-atom interaction. For the field’s creation-annihilation operators, this dependence coincides with 
that of the usual 1D oscillator - see Eq. (5.171), in which $\omega_0$ should be now replaced with $\omega$: 

$$\hat{a}(t) = \hat{a}(0)e^{-i\omega t}, \quad \hat{a}^\dagger(t) = \hat{a}^\dagger(0)e^{+i\omega t}.$$ (9.26)

Hence Eq. (25) becomes 

$$\Gamma = \pi\omega\left|\langle fin|\left[\hat{a}^\dagger(0)e^{i\omega t} - \hat{a}(0)e^{-i\omega t}\right] e(\mathbf{r})\,|ini\rangle\right|^2 \left|\langle fin|\,\hat{\mathbf{d}}(t)\cdot\mathbf{n}_e\,|ini\rangle\right|^2 \rho_{fin}.$$ (9.27a)


Now let us multiply the first bra-ket by exp {icot}, and the second one by exp {-icot}: 


12 As a reminder: this relation, with the single-particle expression $\mathbf{d} = q\mathbf{r}$, has already been used several times - 
see, e.g., Eqs. (6.32) and (6.149). In contrast to the former of those cases, now we have to account for the 
quantum nature of the electromagnetic field $\boldsymbol{\mathscr{E}}$, so in Eq. (24) it is represented by the (vector) operator (16a). 

13 Please note that (as was promised) we have gradually slipped to the analysis of open, irreversible systems, with 
the detector(s) playing the role of a continuous-spectrum environment for the quantized electromagnetic field. 

14 In a multimode field, the modes are typically incoherent, so that the total transition rate may be calculated as 
the sum of the partial rates of each mode - as we will do for a certain case below. 

15 By the way, this expression shows that for the single-particle transitions from the ground state to the $n$-th Fock state, 
the absorption rate is indeed proportional to the oscillator strength $f_n = (2m/\hbar^2)(E_n - E_0)\,|\langle n|x|0\rangle|^2$ of the transition, 
where $x$ is the particle’s coordinate in the direction of the external field. As was discussed in Chapter 5, the strengths 
obey the Thomas-Reiche-Kuhn sum rule $\sum_n f_n = 1$. 

$$\Gamma = \pi\omega\left|\langle fin|\left[\hat{a}^\dagger(0)e^{2i\omega t} - \hat{a}(0)\right] e(\mathbf{r})\,|ini\rangle\right|^2 \left|\langle fin|\,\hat{\mathbf{d}}(t)\cdot\mathbf{n}_e\,e^{-i\omega t}\,|ini\rangle\right|^2 \rho_{fin}.$$ (9.27b)

The physical sense of this, mathematically trivial, operation is that at resonant photon absorption, only 
the annihilation operator gives a significant time-averaged contribution to the first bra-ket matrix 
element. (Similarly, according to Eq. (4.199), the Heisenberg operator of the dipole moment, 
corresponding to the increase of the atom’s energy, has only the Fourier components that differ from $\omega$ 
only by ~$\Gamma \ll \omega$. Its time dependence thus compensates the additional factor in the second bra-ket of 
Eq. (27b), so that this bra-ket is also frequency-independent and has a substantial time average.) Hence, 
we can neglect the fast-evolving term in the first bra-ket, whose average over the time interval ~$1/\Gamma$ is very 
close to zero. 16 

Now let us assume that we use the same detector, characterized by the same second bra-ket and 
the same state density $\rho_{fin}$, for measurement of various electromagnetic fields - or just the same field at 
different points $\mathbf{r}$. Then we are only interested in the behavior of the first, field-related factor, and may 
write 

$$\Gamma \propto \left|\langle fin|\,\hat{a}\,e(\mathbf{r})\,|ini\rangle\right|^2 = \langle fin|\,\hat{a}\,e(\mathbf{r})\,|ini\rangle \langle fin|\,\hat{a}\,e(\mathbf{r})\,|ini\rangle^* = \langle ini|\,\hat{a}^\dagger e^*(\mathbf{r})\,|fin\rangle \langle fin|\,\hat{a}\,e(\mathbf{r})\,|ini\rangle,$$ (9.28)

where the creation-annihilation operators are assumed to be taken at the initial moment (i.e., in the 
Schrödinger picture), and the initial and final states are those of the field alone. As we know, any 1D 
harmonic oscillator (and hence the electromagnetic field oscillator) has many equidistant levels, so even 
if it initially was in a certain state, it may undergo several coherent transitions to different final Fock 
states. If we want to calculate the total rate, we may sum the transition rates into all final states. Then, 
since these states form a full and orthonormal set, we may use the closure condition (4.44) to get 

Photon counting rate:

$$\Gamma \propto \sum_{fin} \langle ini|\,\hat{a}^\dagger e^*(\mathbf{r})\,|fin\rangle \langle fin|\,\hat{a}\,e(\mathbf{r})\,|ini\rangle = \langle ini|\,\hat{a}^\dagger\hat{a}\,|ini\rangle\, e^*(\mathbf{r})e(\mathbf{r}) = \langle n\rangle_{ini}\,|e(\mathbf{r})|^2.$$ (9.29)

Let us apply this formula to several possible quantum states of the field mode. 

(i) First, as a sanity check, the ground initial state ($n = 0$) gives no photon counts at all. The 
interpretation is easy: the ground state cannot emit a photon that would trigger an atom in the counter. 
Again, this does not mean that the ground-state motion is not observable (if you still think so, please 
review the Casimir effect discussion in the last section), just that it cannot ionize an atom in the detector 
- because it does not have any spare energy for doing that. 

(ii) All other coherent states (Fock, Glauber, squeezed, etc.) of the field oscillator give the same 
counting rate, provided that their $\langle n\rangle$ is the same. This result may be less evident if we apply Eq. (29) to 
an interference of two light beams from the same source (say, in the double-slit or the Bragg-scattering 
configurations). In this case we may present the spatial distribution of the field as a sum 

$$e(\mathbf{r}) = e_1(\mathbf{r}) + e_2(\mathbf{r}).$$ (9.30)

Here each term describes one possible wave path, so that the field product in Eq. (29) may be a rapidly 
changing function of the detector position. For this configuration, our result (29) means that the 


16 This is essentially the same rotating wave approximation (RWA) which was already used in Sec. 6.3 - see the 
transition from Eq. (6.90) to the first of Eqs. (6.94). 




interference pattern (and its contrast) are independent of the particular state of the electromagnetic 
field’s mode. 

(iii) Surprisingly, the last statement is also valid for a classical mixture of the different 
eigenstates of the same field mode, for example for its thermal-equilibrium state. Indeed, in this case we 
need to average Eq. (29) over the corresponding classical ensemble, but it would only result in a 
different meaning of averaging n in that equation; the field part describing the interference pattern is not 
affected. 

The last result may look a bit counter-intuitive, because common sense tells us that the 
stochasticity associated with thermal equilibrium has to suppress the interference pattern contrast. These 
expectations are (partly :-) justified, because a typical thermal source of radiation produces many field 
modes $j$, rather than the one mode we have analyzed. These modes may have different wave numbers $k_j$ and 
hence different field distribution functions $\mathbf{e}_j(\mathbf{r})$, resulting in shifted interference patterns. Their 
summation would indeed smear the interference, suppressing its contrast. 
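
Both statements, the independence of the single-mode contrast of $\langle n\rangle$ and its suppression in a multimode field, may be illustrated with a toy calculation. The sketch below rests on assumptions that are not in the text above: each mode contributes a rate proportional to $\langle n_j\rangle|e_1 + e_2|^2$, as Eqs. (29)-(30) suggest for incoherent modes; the two paths have equal field magnitudes and a relative phase $k_j\Delta$, where $\Delta$ is a hypothetical path difference; and the mode set (21 wave numbers within ±10% of $k_0 = 1$) is arbitrary.

```python
import math

def rate(delta, modes):
    """Counting rate (arbitrary units) at path difference delta.

    modes: list of (k_j, n_avg_j) pairs. Toy model (an assumption): for each
    mode, e_1 and e_2 have equal magnitudes and relative phase k_j*delta,
    so |e_1 + e_2|^2 is proportional to 2*(1 + cos(k_j*delta)).
    """
    return sum(n_avg * 2.0 * (1.0 + math.cos(k * delta)) for k, n_avg in modes)

def visibility(modes, deltas):
    """Interference contrast (max - min)/(max + min) over the scanned window."""
    rates = [rate(d, modes) for d in deltas]
    return (max(rates) - min(rates)) / (max(rates) + min(rates))

# Scan two fringes around a path difference of 50 (in units of 1/k_0, k_0 = 1).
deltas = [50.0 + i * 4.0 * math.pi / 4000 for i in range(4001)]

single_weak = [(1.0, 1.0)]      # one mode, <n> = 1
single_strong = [(1.0, 50.0)]   # the same mode, <n> = 50
# 21 modes with wave numbers within +-10% of k_0, equal <n_j>.
multimode = [(0.9 + 0.01 * j, 1.0) for j in range(21)]

print(visibility(single_weak, deltas))
print(visibility(single_strong, deltas))
print(visibility(multimode, deltas))
```

In this model the single-mode contrast is practically 1 for both values of $\langle n\rangle$, while the multimode contrast in the same window is several times lower, because the fringes of different modes are mutually shifted.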

So the use of a single photon detector is not a suitable way to distinguish different quantum 
states of an electromagnetic field mode. This task, however, may be achieved using the photon 
counting correlation technique shown in Fig. 2. 17 



Fig. 9.2. Photon counting correlation 
measurements. (The intensities of the 
split beams should be comparable, but 
not necessarily equal.) 


In this experiment, the counter rate correlation may be characterized by the so-called second-order 
correlation function 18 of the counting rates, 

Second-order correlation function: definition

$$g^{(2)}(\tau) \equiv \frac{\langle \Gamma_1(t)\,\Gamma_2(t-\tau)\rangle}{\langle \Gamma_1(t)\rangle\langle \Gamma_2(t)\rangle},$$ (9.31)


17 It was pioneered as early as in the mid-1950s (i.e. before the advent of lasers!), by R. Hanbury Brown and R. 
Twiss. Their first experiment was also remarkable for the rather unusual light source they used - star Sirius! (It 
was a part of an attempt to improve astrophysics interferometry techniques.) 

18 The reader may be interested in what the first-order correlation function is. It is usually defined as 

$$g^{(1)}(\tau) \equiv \frac{\langle \hat{\mathscr{E}}(\mathbf{r}_1, t)\,\hat{\mathscr{E}}^\dagger(\mathbf{r}_2, t-\tau)\rangle}{\left[\langle \hat{\mathscr{E}}(\mathbf{r}_1, t)\,\hat{\mathscr{E}}^\dagger(\mathbf{r}_1, t)\rangle \langle \hat{\mathscr{E}}(\mathbf{r}_2, t)\,\hat{\mathscr{E}}^\dagger(\mathbf{r}_2, t)\rangle\right]^{1/2}}.$$

In the single-mode case, and the rotating-wave approximation, the function is proportional to the c-number 
product $e(\mathbf{r}_1)e^*(\mathbf{r}_2)$, with all creation-annihilation operators cancelled, i.e. is suitable for characterizing 
interference patterns (30), but not the quantum state of the electromagnetic field. 

where the averaging may be carried out either over many similar experiments, or over time $t$, due to the 
ergodicity of the experiment (with a stationary light source). Using the normalized correlation function 
(31) is very convenient, because characteristics of the detectors and beam splitter drop out from this 
fraction. 

Very unexpectedly for the mid-1950s, Hanbury Brown and Twiss discovered that the correlation 
function depends on time delay $\tau$ in the way shown schematically by the solid line in Fig. 3. It is evident 
from Eq. (31) that if the counting events are completely independent, $g^{(2)}(\tau)$ should be equal to 1 - which is 
always the case in the limit $\tau \to \infty$. Hence, the observed behavior at $\tau \to 0$ corresponds to the positive 
correlation of detector counts at small time delays, i.e. to a higher probability of the nearly simultaneous 
arrival of photons at both counters. This effect is called photon bunching. 



Fig. 9.3. Photon bunching (solid line) and 
antibunching for various $n$ (dashed lines). The 
lines approach level $g^{(2)} = 1$ at $\tau \to \infty$ (on the 
time scale depending on the light source). 


Let us use our simple single-mode model to analyze this experiment. Now the elementary 
quantum process, characterized by the numerator of Eq. (31), is the correlated triggering of two 
counters, at two spatial-temporal points $\{\mathbf{r}_1, t\}$ and $\{\mathbf{r}_2, t - \tau\}$, by the same field mode, so that we need to 
make the following replacement in the first of Eqs. (25): 

$$\hat{\mathscr{E}}(\mathbf{r}, t) \to \text{const} \times \hat{\mathscr{E}}(\mathbf{r}_1, t)\,\hat{\mathscr{E}}(\mathbf{r}_2, t - \tau).$$ (9.32)

Repeating all the manipulations done in the single-counter case, we get 

$$\langle \Gamma_1(t)\,\Gamma_2(t-\tau)\rangle \propto \langle ini|\,\hat{a}^\dagger(t)\,\hat{a}^\dagger(t-\tau)\,\hat{a}(t-\tau)\,\hat{a}(t)\,|ini\rangle\, e^*(\mathbf{r}_1)e^*(\mathbf{r}_2)e(\mathbf{r}_1)e(\mathbf{r}_2).$$ (9.33)

Plugging this expression, as well as Eq. (29) for single-counter rates, into Eq. (31), we see that the field 
distribution factors (as well as the detector-specific bra-kets and the density of states $\rho_{fin}$) cancel, giving 
a very simple final expression 

$$g^{(2)}(\tau) = \frac{\langle \hat{a}^\dagger(t)\,\hat{a}^\dagger(t-\tau)\,\hat{a}(t-\tau)\,\hat{a}(t)\rangle}{\langle \hat{a}^\dagger(t)\,\hat{a}(t)\rangle^2},$$ (9.34)

where the averaging should be carried out, as before, over the initial state of the field. Still, the 
calculation of this expression for arbitrary $\tau$ may be quite complex, because the relaxation of the 
correlation function to the asymptotic value $g^{(2)}(\infty)$ in many cases is due to the interaction of the light 
source with the environment, and hence requires the open-system techniques which were discussed in 
Chapter 7. However, the zero-delay value $g^{(2)}(0)$ may be calculated in a straightforward way, because 
the time arguments of all operators are equal, so that we may write 

Zero-delay correlation:

$$g^{(2)}(0) = \frac{\langle \hat{a}^\dagger\hat{a}^\dagger\hat{a}\hat{a}\rangle}{\langle \hat{a}^\dagger\hat{a}\rangle^2}.$$ (9.35)

Let us evaluate this ratio for the simplest states of the field. (Remember, we are working in the 
Schrödinger picture now.) 

(i) The $n$-th Fock state. In this case, it is convenient to act by the annihilation operators upon the 
ket-vectors, and by the creation operators, upon the bra-vectors, using Eq. (19): 

Photon antibunching:

$$g^{(2)}(0) = \frac{\langle n|\hat{a}^\dagger\hat{a}^\dagger\hat{a}\hat{a}|n\rangle}{\langle n|\hat{a}^\dagger\hat{a}|n\rangle^2} = \frac{\langle n-2|\,[n(n-1)]^{1/2}[n(n-1)]^{1/2}\,|n-2\rangle}{\langle n-1|\,n^{1/2}n^{1/2}\,|n-1\rangle^2} = \frac{n(n-1)}{n^2} = 1 - \frac{1}{n}.$$ (9.36)


We see that the correlation function at small delays is suppressed rather than enhanced - see the dashed 
lines in Fig. 3. This photon antibunching effect has a very simple explanation: a single photon emitted by 
the wave source may be absorbed by just one of the detectors. For the initial state $n = 1$, this is the only 
option, and it is very natural that Eq. (36) predicts no simultaneous counts at $\tau = 0$. Despite this 
theoretical simplicity, reliable observations of the antibunching were not carried out until 1977, 19 
due to the experimental difficulty of creating Fock states of electromagnetic field oscillators - see Sec. 4 
below. 


(ii) The Glauber state $\alpha$. A similar procedure, but now using Eq. (5.155) and its Hermitian
conjugate, $\langle\alpha|\hat a^\dagger = \langle\alpha|\alpha^*$, yields

$$g^{(2)}(0) = \frac{\langle\alpha|\hat a^\dagger \hat a^\dagger \hat a\,\hat a|\alpha\rangle}{\langle\alpha|\hat a^\dagger \hat a|\alpha\rangle^2} = \frac{\alpha^*\alpha^*\alpha\,\alpha}{(\alpha^*\alpha)^2} = 1, \qquad (9.37)$$

for any parameter $\alpha$. We see that the result is very different from that for the Fock states, unless in the
latter case $n \to \infty$. (We know that the Fock and Glauber properties should also coincide for the ground
state, but at that state the correlation function’s value is uncertain, because there are no photon counts at
all.)

(iii) Classical mixture. From Chapter 7, we know that such ensembles cannot be described by
single state vectors, and require the density matrix w for their description. In particular, we can use the
key Eq. (7.5) to write

$$g^{(2)}(0) = \frac{\mathrm{Tr}\,(\hat w\,\hat a^\dagger \hat a^\dagger \hat a\,\hat a)}{[\mathrm{Tr}\,(\hat w\,\hat a^\dagger \hat a)]^2}. \qquad (9.38)$$

The calculation is easy for an ensemble in thermodynamic equilibrium, because here the density
matrix is diagonal in the basis of Fock states n - see Eqs. (7.23)-(7.25):


19 H. J. Kimble et al., Phys. Rev. Lett. 39, 691 (1977). For a detailed review of photon antibunching, see, e.g., H.
Paul, Rev. Mod. Phys. 54, 1061 (1982).




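$$W_{nn'} = W_n\,\delta_{nn'}, \qquad W_n = \frac{\exp\{-n\hbar\omega/k_\mathrm{B}T\}}{\sum_{n=0}^{\infty}\exp\{-n\hbar\omega/k_\mathrm{B}T\}} \equiv \frac{\lambda^n}{\sum_{n=0}^{\infty}\lambda^n}, \qquad \text{where } \lambda \equiv \exp\left\{-\frac{\hbar\omega}{k_\mathrm{B}T}\right\}. \qquad (9.39)$$

So, for the operators in the numerator and denominator of Eq. (38) we also need just the diagonal terms
of the operator products that have already been calculated - see Eq. (36). As a result, we get

$$g^{(2)}(0) = \frac{\sum_{n=0}^{\infty} W_n\, n(n-1)}{\left(\sum_{n=0}^{\infty} W_n\, n\right)^2} = \frac{\sum_{n=0}^{\infty}\lambda^n n(n-1) \times \sum_{n=0}^{\infty}\lambda^n}{\left(\sum_{n=0}^{\infty}\lambda^n n\right)^2}. \qquad (9.40)$$

One of these sums is just the geometric progression,

$$\sum_{n=0}^{\infty}\lambda^n = \frac{1}{1-\lambda}, \qquad (9.41)$$

and the remaining two sums may be readily calculated by its differentiation over parameter $\lambda$:

$$\sum_{n=0}^{\infty}\lambda^n n = \lambda\frac{d}{d\lambda}\sum_{n=0}^{\infty}\lambda^n = \lambda\frac{d}{d\lambda}\frac{1}{1-\lambda} = \frac{\lambda}{(1-\lambda)^2},$$
$$\sum_{n=0}^{\infty}\lambda^n n(n-1) = \lambda^2\frac{d^2}{d\lambda^2}\sum_{n=0}^{\infty}\lambda^n = \lambda^2\frac{d^2}{d\lambda^2}\frac{1}{1-\lambda} = \frac{2\lambda^2}{(1-\lambda)^3}, \qquad (9.42)$$

and for the correlation function we get an extremely simple result independent of parameter $\lambda$ and hence
of temperature:

$$g^{(2)}(0) = \frac{[2\lambda^2/(1-\lambda)^3]\,[1/(1-\lambda)]}{[\lambda/(1-\lambda)^2]^2} = 2. \qquad (9.43)$$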

This is exactly the photon bunching effect first observed by Hanbury Brown and Twiss (Fig.
3). We see that in contrast to antibunching, this is an essentially classical (statistical) effect. Indeed, Eq.
(43) allows a purely classical proof. In the classical theory, the counting rate is proportional to the wave
intensity I, so that Eq. (31) is reduced to

$$g^{(2)}(0) = \frac{\langle I^2\rangle}{\langle I\rangle^2}, \qquad \text{with } I \propto \overline{E^2(t)} \propto E_\omega E_\omega^*. \qquad (9.44)$$


For a sinusoidal field, the intensity is constant, and $g^{(2)}(0) = 1$. (This is also evident from Eq. (37),
because the classical state may be considered as the Glauber state with $\alpha \to \infty$.) On the other hand, if
the intensity fluctuates (either in time, or from one experiment to another), the averages should be
calculated as

$$\langle I^N\rangle = \int_0^\infty w(I)\,I^N\,dI, \qquad \text{with } \int_0^\infty w(I)\,dI = 1, \qquad (9.45)$$
where w(I) is the probability density. For the classical (Boltzmann) statistics, the probability is an
exponential function of the electromagnetic field energy, and hence of its intensity:

$$w(I) = C e^{-\beta I}, \qquad \text{where } \beta \propto 1/k_\mathrm{B}T, \qquad (9.46)$$

so that Eqs. (45) yield:


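$$\int_0^\infty C\exp\{-\beta I\}\,dI = 1, \qquad \text{so that } C = \beta,$$
$$\langle I^N\rangle = \int_0^\infty w(I)\,I^N\,dI = C\int_0^\infty \exp\{-\beta I\}\,I^N\,dI = \frac{1}{\beta^N}\int_0^\infty \exp\{-\xi\}\,\xi^N\,d\xi = \begin{cases} 1/\beta, & \text{for } N = 1, \\ 2/\beta^2, & \text{for } N = 2. \end{cases} \qquad (9.47)$$

Plugging these results into Eq. (44), we get $g^{(2)}(0) = 2$, in complete agreement with Eq. (43). 20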
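
The three zero-delay values obtained above (for the Fock, Glauber, and thermal states) can be cross-checked numerically. The following Python sketch is an illustration only, not part of the derivation. It assumes that, since the operator product in Eq. (35) is diagonal in the Fock basis with eigenvalues n(n - 1), the ratio may be computed from the photon-number probabilities W_n alone; for the Glauber state it further assumes the standard Poissonian photon-number distribution. All function names and truncation limits are hypothetical choices made for this example.

```python
import math

def g2_zero(weights):
    """Zero-delay correlation from photon-number probabilities W_n:
    g2(0) = sum_n W_n n(n-1) / (sum_n W_n n)^2."""
    mean_n = sum(w * n for n, w in enumerate(weights))
    pairs = sum(w * n * (n - 1) for n, w in enumerate(weights))
    return pairs / mean_n ** 2

def fock_weights(n):
    """A pure Fock state n: W_n = 1, all other probabilities vanish."""
    return [0.0] * n + [1.0]

def poisson_weights(mean_n, n_max=200):
    """Poissonian W_n = exp(-<n>) <n>^n / n! (assumed for the Glauber state),
    built iteratively and truncated at n_max."""
    weights = [math.exp(-mean_n)]
    for n in range(1, n_max + 1):
        weights.append(weights[-1] * mean_n / n)
    return weights

def thermal_weights(lam, n_max=2000):
    """Thermal W_n = (1 - lambda) lambda^n, i.e. Eq. (39) with the
    geometric-progression sum carried out; truncated at n_max."""
    return [(1.0 - lam) * lam ** n for n in range(n_max + 1)]

if __name__ == "__main__":
    print("Fock n=1 :", g2_zero(fock_weights(1)))
    print("Fock n=4 :", g2_zero(fock_weights(4)))
    print("Glauber  :", g2_zero(poisson_weights(3.0)))
    print("Thermal  :", g2_zero(thermal_weights(0.5)))
```

With these assumptions, the Fock-state values should follow 1 - 1/n, while the Poissonian and thermal distributions should give values close to 1 and 2, respectively, up to truncation and rounding errors.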


9.3. Spontaneous and stimulated emission 

In our simple model for photon counting, considered in the last section, trigger atoms of the
photon counter absorbed light. Now let us have a look at the opposite process of spontaneous emission
of photons by an atom in an excited state, still using the same electric-dipole approximation for the
atom-to-field interaction. We may still use the Golden Rule for the model depicted in Fig. 6.14, but now
the roles have changed: we have to associate operator $\hat A$ with the electric dipole moment of the atom,
and operator $\hat B$ with the electric field, while the continuous spectrum of system b represents the
plurality of the electromagnetic field modes into which the spontaneous radiation may happen. Since
now the transition increases the energy of the electromagnetic field, after the multiplication of the field
bra-ket by $\exp\{i\omega t\}$, we may keep only the photon creation operator whose time evolution compensates
this fast “rotation”. As a result, the Golden Rule takes the following form:


$$\Gamma_s = \pi\omega\,\left|\langle \text{fin}|\hat a^\dagger|0\rangle\,\langle \text{fin}|\,\hat{\mathbf d}\cdot\mathbf e(\mathbf r)\,|\text{ini}\rangle\right|^2 \rho_\text{fin}, \qquad (9.48)$$

where the first bra-ket refers to the field and the second one to the atom, all operators and states are
time-independent, and $\rho_\text{fin}$ is now the density of final states of the
electromagnetic field - which in this problem plays the role of the atom’s environment. Here the
electromagnetic field has been assumed to be initially in the ground state - the assumption that will be
altered later in this section.


Relation (48), together with Eq. (19), shows that in order for the field’s matrix element to be different
from zero, the final state of the field has to be the first excited Fock state, n = 1. (By the way, this is
exactly the most practicable way of generating an excited Fock state of a field oscillator - whose
existence was taken for granted in our discussion in Sec. 2.) With that, Eq. (48) yields

$$\Gamma_s = \pi\omega\,\left|\langle \text{fin}|\,\hat{\mathbf d}\cdot\mathbf e(\mathbf r)\,|\text{ini}\rangle\right|^2 \rho_\text{fin} = \pi\omega\,\left|\langle \text{fin}|\,\hat d\,e_d(\mathbf r)\,|\text{ini}\rangle\right|^2 \rho_\text{fin}, \qquad (9.49)$$


20 For some field states, including the squeezed ground states $\zeta$ discussed in the end of Sec. 5.5, values $g^{(2)}(0)$ may
be even higher than 2 - the so-called super-bunching. Analysis of one particular case of super-bunching is offered
to the reader - see the exercise problem list.
where the density $\rho_\text{fin}$ of excited electromagnetic field states should be calculated at energy $\hbar\omega$, and $e_d$ is
the component of the vector $\mathbf e(\mathbf r)$ along the electric dipole direction. 21 For plane waves, the calculation of
this density was our first step in this course - see Eq. (1.1). 22 From it, we get

$$\rho_\text{fin} \equiv \frac{dN}{dE} = V\,\frac{8\pi\nu^2}{c^3}\,\frac{d\nu}{dE} = V\,\frac{\omega^2}{\pi^2\hbar c^3}, \qquad (9.50)$$


where the bounding volume V should be large enough to ensure the spectrum’s virtual continuity. Because
of that, in the normalization condition used to simplify Eq. (9), we may consider $\mathbf e(\mathbf r)$ constant.
Presenting the square of this vector as a sum of squares of its three perpendicular components (one of those,
$e_d$, aligned with the dipole direction), due to space isotropy we may write

$$e^2 = e_d^2 + e_{\perp 1}^2 + e_{\perp 2}^2 = 3e_d^2. \qquad (9.51)$$


As a result, the normalization condition yields

$$e_d^2 = \frac{1}{3\varepsilon_0 V}, \qquad (9.52)$$


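and Eq. (49) gives the famous (and very important) formula 23

$$\Gamma_s = \frac{1}{4\pi\varepsilon_0}\,\frac{4\omega^3}{3\hbar c^3}\,\langle \text{fin}|\hat{\mathbf d}|\text{ini}\rangle\cdot\langle \text{fin}|\hat{\mathbf d}|\text{ini}\rangle^* \equiv \frac{1}{4\pi\varepsilon_0}\,\frac{4\omega^3}{3\hbar c^3}\,\left|\langle \text{fin}|\hat{\mathbf d}|\text{ini}\rangle\right|^2. \qquad (9.53)$$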

Leaving a comparison of this formula with the classical theory of radiation, 24 and the exact
evaluation of $\Gamma_s$ for a particular transition in the hydrogen atom, for the reader’s exercises, let me just
estimate its order of magnitude. Assuming that $d \sim e r_\mathrm{B} \equiv e\hbar^2/m_e(e^2/4\pi\varepsilon_0)$ and $\hbar\omega \sim E_\mathrm{H} \equiv m_e(e^2/4\pi\varepsilon_0)^2/\hbar^2$,
and taking into account the definition (6.62) of the fine structure constant $\alpha \approx 1/137$, we get

$$\frac{\Gamma}{\omega} \sim \left(\frac{e^2}{4\pi\varepsilon_0\hbar c}\right)^3 \equiv \alpha^3 \sim 3\times10^{-7}. \qquad (9.54)$$


This estimate says that the emission lines at atomic transitions are typically very sharp. With the
present-day availability of high-speed electronics, it also makes sense to evaluate the time scale $\tau = 1/\Gamma$
of the typical quantum transition: for a typical optical frequency $\omega \sim 3\times10^{15}\ \mathrm{s}^{-1}$, it is close to 1 ns. This is


21 Here I have smuggled back the sum over all electromagnetic field modes j - see Eq. (16). Since in the
quasistationary approximation, kp ≪ 1, which is necessary for the interaction presentation by Eq. (24), matrix
elements (49) are independent of $\mathbf k_j$, the summation is reduced to the calculation of the total $\rho_\text{fin}$ for all modes.

22 Note the essential dependence of Eq. (50), and hence of Eq. (53), on the field geometry; all following formulas
of this section are valid for free 3D space only. If the same atom is placed into a high-Q resonant cavity (see, e.g.,
EM 7.9), the rate of its photon emission is strongly suppressed at frequencies between the cavity resonances
(where $\rho_\text{fin} \to 0$) - see, e.g., the review of first experiments by S. Haroche and D. Kleppner, Phys. Today 42, 24
(Jan. 1989). On the other hand, the emission is strongly (by a factor $\sim(\lambda^3/V)Q$, where V is the cavity’s volume)
enhanced at resonance frequencies - the so-called Purcell effect, discovered by E. Purcell already in the 1940s.
For a brief discussion of these and other quantum effects in cavities, see the next section.

23 An equivalent expression was first obtained in 1930 by V. Weisskopf and E. Wigner, so that the whole
calculation is sometimes referred to as the Weisskopf-Wigner theory.

24 See, e.g., EM Sec. 8.2, in particular Eq. (8.28). 
exactly the time constant that determines the photon counting statistics of the emitted radiation - see
Fig. 3. Colloquially, this is the temporal scale of the photon spontaneously emitted by an atom. 25
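
The order-of-magnitude estimate above is easy to reproduce. The short Python sketch below is an illustration only: it takes the value 1/137 for the fine structure constant and the "typical" optical frequency of 3×10^15 s^-1 from the text, and omits all numerical factors of order unity, just as the estimate (54) does. The function and constant names are hypothetical.

```python
ALPHA = 1.0 / 137.0   # fine structure constant, value quoted in the text

def emission_estimate(omega):
    """Order-of-magnitude estimate of spontaneous emission, following Eq. (54):
    Gamma/omega ~ alpha^3, and hence tau = 1/Gamma.
    omega is the transition frequency in s^-1."""
    ratio = ALPHA ** 3          # relative linewidth Gamma/omega
    gamma = ratio * omega       # emission rate, s^-1
    tau = 1.0 / gamma           # time scale of the transition, s
    return ratio, gamma, tau

if __name__ == "__main__":
    ratio, gamma, tau = emission_estimate(3e15)
    print(f"Gamma/omega ~ {ratio:.1e}, Gamma ~ {gamma:.1e} 1/s, tau ~ {tau:.1e} s")
```

For the quoted frequency, the resulting time scale is of the order of a nanosecond; the exact prefactor depends on the dipole matrix element of the particular transition.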


Note, however, that the above estimate of $\tau$ is only valid for a transition with a non-vanishing
dipole matrix element. If it equals zero - say, due to the initial and final state symmetry - the dipole
transitions are “forbidden”. (Another commonly used term is the transition selection rules. 26 ) A
“forbidden” transition may still take place due to a different, smaller interaction (say, via a magnetic
dipole field of the atom, or its quadrupole electric field 27 ), but would take much longer. In some cases
the increase of $\tau$ is rather dramatic - sometimes to hours! Such long-lasting radiation is called
luminescence - or fluorescence if the initial atom’s excitation was due to an external radiation of higher
frequency, followed first by non-radiative transitions down the energy level ladder.


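Now let us consider a more general case when the electromagnetic field is initially in an arbitrary
Fock state n, and from it may either get energy from the atomic system (photon emission) or, vice versa,
give it back to the atom (photon absorption). For the photon emission rate, an evident generalization of
Eq. (48) gives

$$\frac{\Gamma_e}{\Gamma_s} \equiv \frac{\Gamma_{n\to\text{fin}}}{\Gamma_{0\to1}} = \frac{\left|\langle \text{fin}|\hat a^\dagger|n\rangle\right|^2}{\left|\langle 1|\hat a^\dagger|0\rangle\right|^2}, \qquad (9.55)$$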


where both bra-kets may be taken in the Schrödinger picture, and $\Gamma_s$ is the spontaneous emission rate
(53) of the same atomic system. This relation, with the account of Eq. (19), shows that at photon
emission, the final field state fin has to be the Fock state with n’ = n + 1, and that

$$\Gamma_e = (n+1)\,\Gamma_s. \qquad (9.56)$$

Thus the initial field increases the photon emission rate; this effect is called the stimulated emission of
radiation. Note that the spontaneous emission may be considered as a particular case of stimulated
emission for n = 0, and interpreted as the emission stimulated by zero-point fluctuations of the
electromagnetic field.

On the other hand, in accordance with the arguments of Sec. 2, for the description of radiation
absorption the photon creation operator has to be replaced with the annihilation one, to get

$$\frac{\Gamma_a}{\Gamma_s} \equiv \frac{\Gamma_{n\to\text{fin}}}{\Gamma_{0\to1}} = \frac{\left|\langle \text{fin}|\hat a|n\rangle\right|^2}{\left|\langle 1|\hat a^\dagger|0\rangle\right|^2}. \qquad (9.57)$$


25 The scale $c\tau$ of the spatial extension of the corresponding wave packet is surprisingly macroscopic - in the
range of tens of centimeters for $\tau \sim 1$ ns. Such “human” size of the emitted photons makes the optical table the key component
of many optical experiments.

26 As was already mentioned in Sec. 5.6, for a single particle moving in a spherically-symmetric potential (e.g., a
hydrogen-like atom), the selection rules are simple: the only allowed electric-dipole transitions are those with $\Delta l \equiv l_\text{fin} - l_\text{ini} = \pm1$
and $\Delta m \equiv m_\text{fin} - m_\text{ini} = 0$. The simplest example of the transition that does not satisfy this rule is that
between states with n = 2 and n = 1, both with l = 0; because of that, the lifetime of the lowest excited s-state in
hydrogen is as long as ~0.15 s.

27 See, e.g., EM Sec. 8.9. 
According to this equation, the final state of the field at absorption is the Fock state with n’ = n - 1, and
Eq. (57) yields 28

$$\Gamma_a = n\,\Gamma_s. \qquad (9.58)$$


Results (56) and (58) are usually formulated in terms of relations between the Einstein coefficients A
and B, defined in the way shown in Fig. 4, where the two energy levels are those of the atom, $\Gamma_a$ is the
rate of energy absorption from the electromagnetic field, and $\Gamma_e$ is that of the energy emission into the
field. In this notation, Eqs. (56) and (58) say

$$A_{21} = B_{21} = B_{12}, \qquad (9.59)$$

because each of these coefficients equals the spontaneous emission rate $\Gamma_s$.

Fig. 9.4. The Einstein coefficients on
the atomic energy spectrum diagram.


It is curious that from this point, there is just one step to an alternative derivation of the Bose-
Einstein statistics for photons. Indeed, in the thermodynamic equilibrium, the average probability flows
between levels 1 and 2 should be equal:

$$W_2\langle\Gamma_e\rangle = W_1\langle\Gamma_a\rangle, \qquad (9.60)$$

where $W_1$ and $W_2$ are the probabilities for the atomic system to be on the corresponding levels, so that
Eqs. (56) and (58) yield

$$W_2\,\Gamma_s\langle 1+n\rangle = W_1\,\Gamma_s\langle n\rangle, \qquad \text{i.e. } \frac{W_2}{W_1} = \frac{\langle n\rangle}{\langle n\rangle + 1}. \qquad (9.61)$$


But, on the other hand, for the atomic subsystem, only weakly coupled to its electromagnetic
environment, we ought to have the Gibbs distribution of probabilities:

$$\frac{W_2}{W_1} = \frac{\exp\{-E_2/k_\mathrm{B}T\}}{\exp\{-E_1/k_\mathrm{B}T\}} = \exp\left\{-\frac{\hbar\omega}{k_\mathrm{B}T}\right\}. \qquad (9.62)$$


Requiring Eqs. (61) and (62) to give the same result for the probability ratio, we get the Bose-Einstein
distribution for the electromagnetic field in equilibrium:

$$\langle n\rangle = \frac{1}{\exp\{\hbar\omega/k_\mathrm{B}T\} - 1}, \qquad (9.63)$$

the same as obtained in Sec. 7.1 by other means - see Eqs. (7.26).
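
The detailed-balance argument may also be checked numerically. The Python sketch below is an illustration only: it solves the balance condition (the ratio of the average photon number to the same number plus one being equal to the Gibbs factor) by plain bisection, without using the closed-form solution, and compares the root with the Bose-Einstein expression. The dimensionless argument x stands for the ratio of the photon energy to the thermal energy; the function names and the tolerance are hypothetical choices.

```python
import math

def mean_n_from_balance(x, tol=1e-12):
    """Solve <n>/(<n> + 1) = exp(-x) for <n> by bisection.
    Here x is the ratio hbar*omega / (k_B T), assumed positive."""
    target = math.exp(-x)
    lo, hi = 0.0, 1.0
    # n/(n+1) grows monotonically with n, so first bracket the root from above
    while hi / (hi + 1.0) < target:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mid / (mid + 1.0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bose_einstein(x):
    """Closed-form Bose-Einstein occupancy 1/(exp(x) - 1)."""
    return 1.0 / math.expm1(x)

if __name__ == "__main__":
    for x in (0.1, 1.0, 5.0):
        print(x, mean_n_from_balance(x), bose_einstein(x))
```

The two columns printed for each x should coincide within the bisection tolerance, which is just a numerical restatement of the algebra leading to Eq. (63).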


28 Relations (56) and (58) were conjectured, from very general arguments, by A. Einstein as early as in 1916. 
Another very important implication of Eqs. (56) and (58) is the possibility of achieving coherent
stimulated emission by level occupancy (or “population”) inversion. Indeed, if $W_2 > W_1$, then
the net power flow from the atomic system into the electromagnetic field,

$$\text{power} = \hbar\omega \times \Gamma_s\left[W_2(\langle n\rangle + 1) - W_1\langle n\rangle\right], \qquad (9.64)$$

may be positive even at large $\langle n\rangle$. The necessary inversion may be produced in several ways, notably by intensive
quantum transitions to level 2 from an even higher level (which, in turn, is populated, e.g., by absorption
of an external radiation, called pumping, at a higher frequency).

A less obvious feature of the stimulated emission is spelled out by Eq. (55): again, it shows that
the final state of the field after the absorption of energy $\hbar\omega$ from the atom is a pure (coherent) Fock state
(n + 1). Colloquially, one may say that the new, (n + 1)st photon emitted from the atom is automatically
in phase with the n photons that had been in the field mode initially. 29 The idea of stimulated emission
of coherent radiation using population inversion 30 was implemented in the early 1950s in the microwave
range (masers) and in 1960 in the optical range (lasers). Nowadays, lasers are ubiquitous and constitute
one of the cornerstones of our technological civilization.

A quantitative discussion of laser operation is beyond the framework of this course, and I have to
refer the reader to special literature, 31 and would like to mention only two key points:

(i) In a typical laser, each generated electromagnetic field mode is in the Glauber (rather than the
Fock) state, so that Eqs. (56) and (58) are applicable only if n is averaged over the Fock-state
decomposition of the Glauber state - see Eq. (5.165).

(ii) Since in a typical laser $\langle n\rangle \gg 1$, its operation may be well described using quasi-classical
theories that use Eq. (64) to describe the electromagnetic energy balance (with the addition of a term
describing the energy loss due to field absorption in external components of the laser, including the
useful load), plus the equation describing the balance of occupancies $W_{1,2}$ due to all inter-level
transitions - similar to Eq. (60), but including also the contribution(s) from the particular population
inversion mechanism used in the laser. In this approach, the role of quantum mechanics is essentially
reduced to the calculation of parameter $\Gamma_s$.

The role becomes more prominent if one needs to describe fluctuations of the laser field. Here
two approaches are possible, following the two options discussed in Chapter 7. If the fluctuations are
relatively small, one can linearize the Heisenberg equations of motion of the field oscillator operators
near their stationary-lasing “values”, with the Langevin “forces” (also time-dependent operators) to
describe the fluctuation sources, and use these Heisenberg-Langevin equations to calculate the radiation
fluctuations, just as was described in Sec. 7.5. On the other hand, near the lasing threshold the field
fluctuations are relatively strong, smearing the phase transition between the no-lasing and lasing states.
Here the linearization is not an option, but one can use the density-matrix approach described in Sec.
7.6, for the fluctuation analysis. 32


29 It is straightforward to show that this fact is also true if the field is initially in the Glauber state - which is more 
typical for lasers. 

30 This idea has been traced back at least to an obscure 1939 publication by V. Fabrikant. 

31 I can recommend, for example, P. W. Milonni and J. H. Eberly, Laser Physics, 2nd ed., Wiley, 2010, and a less
technical text by A. Yariv, Quantum Electronics, 3rd ed., Wiley, 1989.

32 This path has been developed (also in the mid-1960s), by several researchers, notably including M. Scully and
W. Lamb - see, e.g., M. Sargent III, M. Scully, and W. Lamb, Jr., Laser Physics, Westview, 1977. Note that
9.4. Cavity QED

Now I have to mention, at least in passing, the cavity quantum electrodynamics (usually called
cavity QED for short) - the art and science of creating and using entanglement between quantum states
of a single atomic system (either an atom, or an ion, or a molecule, etc.) and the electromagnetic field in
a macroscopic volume called the resonant cavity (or just “resonator”, or just “cavity”). This field is very
popular nowadays, especially in the context of the quantum computation and communication research
discussed in Sec. 8.5. 33


Let me start its discussion by noting that the narrative of the two last sections was based on an
implicit assumption that the energy spectrum of the electromagnetic field interacting with an atomic
system is essentially continuous. This assumption has justified the use of the Golden Rule, implying that the
emitted radiation is spread among many field modes, effectively losing their coherence with the initial
quantum state of the atom. However, this assumption becomes invalid if the electromagnetic field is
contained inside a relatively small volume, with a linear size comparable with the radiation wavelength.
Classical electrodynamics shows 34 that if the walls of such a cavity mostly reflect, rather than absorb,
radiation, so that in the crude approximation the power dissipation may be disregarded, then particular
solutions $\mathbf e_j(\mathbf r)$ of the Helmholtz equation (5) correspond to discrete, well separated mode wavenumbers
$k_j$ and hence well separated eigenfrequencies $\omega_j$. Due to the energy conservation, an atomic transition
corresponding to energy $\Delta E = |E_\text{ini} - E_\text{fin}|$ may be effective only if the corresponding quantum
oscillation frequency $\Omega \equiv \Delta E/\hbar$ is close to one of $\omega_j$ and hence relatively far from other
eigenfrequencies. 35 As a result, the quantum states of a single atomic system and the resonant
electromagnetic mode may become entangled.

A very popular approximation for the qualitative description of this effect is the so-called Rabi
model, 36 in which the atom is treated as a two-level system 37 interacting with a single electromagnetic
field mode of the resonant cavity. As the reader knows well from Chapters 4-6, any two-level (“spin-½”)
system may be described by Hamiltonian $\mathbf c\cdot\hat{\boldsymbol\sigma}$, and we may always select the state basis in which the
Hamiltonian is diagonal:

$$\hat H_\text{atom} = c\,\hat\sigma_z \equiv \frac{\hbar\Omega}{2}\,\hat\sigma_z, \qquad (9.65)$$

where $\hbar\Omega \equiv 2c$ is the energy difference between the eigenenergies in the absence of interaction with the
field. Next, according to Eq. (17), ignoring the constant ground-state energy $\hbar\omega/2$ (that may be added to


while the laser radiation fluctuations may look like a peripheral issue, pioneering research in that field 
has led to the development of the general theory of open quantum systems (which was discussed in 
Chapter 7), that has much broader applications. 

33 This popularity was demonstrated, for example, by the 2012 Nobel Prize in Physics award to cavity QED
experimentalists S. Haroche and D. Wineland.

34 See, e.g., EM Sec. 7.9.

35 On the contrary, if $\Omega$ is far from any $\omega_j$, the interaction is much suppressed; in particular, the spontaneous
emission rate may be much lower than that given by Eq. (53) - so that this result is not as fundamental as it may
look.

36 After the pioneering work by I. Rabi in 1936-37. 

37 As was shown in Sec. 6.5, this model is justified, e.g., if transitions between all other energy level pairs have 
considerably different frequencies. 
the final energy at the very end - if necessary), the contribution of a single mode of eigenfrequency $\omega$ to
the Hamiltonian is

$$\hat H_\text{cavity} = \hbar\omega\,\hat a^\dagger\hat a. \qquad (9.66)$$


Finally, according to Eq. (16a), in quantum electrodynamics the electric field of the mode may be
presented as

$$\hat{\mathscr E}(\mathbf r, t) = \frac{1}{i}\left(\frac{\hbar\omega}{2}\right)^{1/2}\mathbf e(\mathbf r)\left(\hat a - \hat a^\dagger\right), \qquad (9.67)$$

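so that in the electric-dipole approximation (24), the cavity-atom interaction may be presented as a
product of the field by one of the Cartesian components (say, $\hat\sigma_y$) of the “spin” operator: 38

$$\hat H_\text{int} = \text{const}\times\hat\sigma_y\times\hat{\mathscr E} = \text{const}\times\hat\sigma_y\times\left(\frac{\hbar\omega}{2}\right)^{1/2}\frac{1}{i}\left(\hat a - \hat a^\dagger\right) \equiv i\hbar\kappa\,\hat\sigma_y\left(\hat a - \hat a^\dagger\right), \qquad (9.68)$$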


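where $\kappa$ is a coupling constant (with the dimension of frequency). The sum of these terms is called the
Rabi Hamiltonian,

$$\hat H = \hat H_\text{atom} + \hat H_\text{cavity} + \hat H_\text{int} \equiv \frac{\hbar\Omega}{2}\,\hat\sigma_z + \hbar\omega\,\hat a^\dagger\hat a + i\hbar\kappa\,\hat\sigma_y\left(\hat a - \hat a^\dagger\right). \qquad (9.69)$$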


Despite its apparent simplicity, using this Hamiltonian for calculations is not that simple. For
example, an exact quasi-analytical expression for its eigenenergies (as zeros of a Taylor series in
parameter $\kappa$, with coefficients determined by a recurrence relation) was found only recently. 39 Only in
the case when the electromagnetic field is very intense and hence may be treated as a classical one,
the results following from Eq. (69) are reduced to the Rabi oscillations discussed in Sec. 6.3.


In the opposite case when the field oscillator is in an essentially quantum state, $\langle\hat a^\dagger\hat a\rangle \sim 1$, Eq.
(69) may be simplified in a different way, assuming that frequencies $\Omega$ and $\omega$ are very close, and the
atom-to-cavity interaction is relatively weak, so that magnitudes of the coupling constant $\kappa$ and the
detuning parameter (similar to parameter $\Delta$ used in Sec. 6.5),

$$\xi \equiv \Omega - \omega, \qquad (9.70)$$

are both much smaller than $\Omega \approx \omega$. To discuss this limit, it is convenient to use the spin ladder operators
defined absolutely similarly to those of the orbital angular momentum - see Eqs. (5.182):

$$\hat\sigma_\pm \equiv \hat\sigma_x \pm i\hat\sigma_y, \qquad \text{so that } \hat\sigma_y = \frac{\hat\sigma_+ - \hat\sigma_-}{2i}. \qquad (9.71)$$


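From Eq. (4.105), it is easy to find matrices of these operators (in the standard z-basis),

$$\sigma_+ = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad \sigma_- = \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}, \qquad (9.72)$$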


38 The exact choice of this component is not important, while the formulas simplify if it is proportional to either
pure $\hat\sigma_x$ or pure $\hat\sigma_y$.

39 D. Braak, Phys. Rev. Lett. 107 , 100401 (2011). 
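and their commutation rules - that are naturally similar to Eqs. (5.183):

$$[\hat\sigma_+, \hat\sigma_-] = 4\hat\sigma_z, \qquad [\hat\sigma_z, \hat\sigma_\pm] = \pm2\hat\sigma_\pm. \qquad (9.73)$$

In this notation, the Rabi Hamiltonian looks like

$$\hat H = \frac{\hbar\Omega}{2}\,\hat\sigma_z + \hbar\omega\,\hat a^\dagger\hat a + \frac{\hbar\kappa}{2}\left(\hat\sigma_+ - \hat\sigma_-\right)\left(\hat a - \hat a^\dagger\right), \qquad (9.74)$$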


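and it is straightforward to use Eq. (4.199) and (73) to derive the Heisenberg-picture equations of
motion for the involved operators. (Doing this, we have to remember that operators of the “spin”
subsystem, on one hand, and of the field mode, on the other hand, are defined in different Hilbert spaces
and hence commute - at least at coinciding time moments.) The result (so far, exact!) is

$$\dot{\hat a} = -i\omega\hat a + \frac{i\kappa}{2}\left(\hat\sigma_+ - \hat\sigma_-\right), \qquad \dot{\hat a}^\dagger = i\omega\hat a^\dagger + \frac{i\kappa}{2}\left(\hat\sigma_+ - \hat\sigma_-\right),$$
$$\dot{\hat\sigma}_\pm = \pm i\Omega\,\hat\sigma_\pm + i2\kappa\,\hat\sigma_z\left(\hat a - \hat a^\dagger\right), \qquad \dot{\hat\sigma}_z = i\kappa\left(\hat a^\dagger - \hat a\right)\left(\hat\sigma_+ + \hat\sigma_-\right). \qquad (9.75)$$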
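
The spin part of the algebra used in this derivation is easy to verify with explicit 2×2 matrices. The Python sketch below is an illustration only: it builds the ladder operators from the standard Pauli matrices using the definition (71), i.e. without the factor ½ common in other texts, and checks their matrices and commutators. The helper functions (add, scale, matmul, comm) are hypothetical names introduced for this example.

```python
# Standard Pauli matrices in the z-basis (rows and columns ordered as up, down)
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def add(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Product of a number c and a 2x2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

# Ladder operators defined as sigma_x +/- i sigma_y, as in Eq. (71)
SP = add(SX, scale(1j, SY))
SM = add(SX, scale(-1j, SY))

if __name__ == "__main__":
    print("sigma_+ =", SP)
    print("sigma_- =", SM)
    print("[sigma_+, sigma_-] =", comm(SP, SM))
    print("[sigma_z, sigma_+] =", comm(SZ, SP))
    print("[sigma_z, sigma_-] =", comm(SZ, SM))
```

With this normalization, the only nonvanishing matrix elements of the ladder operators equal 2, and the commutators carry the corresponding factors 4 and ±2; a different normalization of the ladder operators would change these factors, and hence the numerical coefficients in the equations of motion.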


Now note that at negligible coupling, $\kappa \to 0$, equations (75) have very simple solutions,

$$\hat a(t) \propto e^{-i\omega t}, \qquad \hat a^\dagger(t) \propto e^{i\omega t}, \qquad \hat\sigma_\pm(t) \propto e^{\pm i\Omega t}, \qquad \hat\sigma_z(t) \approx \text{const}, \qquad (9.76)$$

and small terms proportional to $\kappa$ in the right-hand parts of Eqs. (75) cannot affect these time evolution
laws dramatically even if $\kappa$ is not exactly zero (but small). Of those terms, ones with frequencies close
to the “basic” frequency of each variable would act in resonance and hence may have a substantial
impact on system dynamics, while non-resonant terms may be ignored. In this rotating-wave
approximation (RWA), used several times before in this course, Eqs. (75) are reduced to a much simpler
system of equations:


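$$\dot{\hat a} = -i\omega\hat a - \frac{i\kappa}{2}\,\hat\sigma_-, \qquad \dot{\hat a}^\dagger = i\omega\hat a^\dagger + \frac{i\kappa}{2}\,\hat\sigma_+,$$
$$\dot{\hat\sigma}_+ = i\Omega\,\hat\sigma_+ - i2\kappa\,\hat\sigma_z\hat a^\dagger, \qquad \dot{\hat\sigma}_- = -i\Omega\,\hat\sigma_- + i2\kappa\,\hat\sigma_z\hat a, \qquad \dot{\hat\sigma}_z = i\kappa\left(\hat a^\dagger\hat\sigma_- - \hat a\,\hat\sigma_+\right). \qquad (9.77)$$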


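Alternatively, these equations of motion may be obtained from the Rabi Hamiltonian after it has
been cleared of the terms proportional to $\hat\sigma_+\hat a^\dagger$ and $\hat\sigma_-\hat a$, that oscillate fast and hence self-average to
virtually zero:

$$\hat H = \frac{\hbar\Omega}{2}\,\hat\sigma_z + \hbar\omega\,\hat a^\dagger\hat a + \frac{\hbar\kappa}{2}\left(\hat\sigma_+\hat a + \hat\sigma_-\hat a^\dagger\right), \qquad \text{at } \kappa, |\xi| \ll \omega, \Omega. \qquad (9.78)$$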


This is the famous Jaynes-Cummings Hamiltonian, 40 which is central to the cavity QED and its
applications. 41 In order to find its eigenstates and eigenenergies, let us note that at negligible interaction


40 It was first proposed and analyzed in 1963 by two electronic engineers, E. Jaynes and F. Cummings, and it took
the physics community a while to recognize and acknowledge the fundamental importance of that work.

41 In most applications, Hamiltonian (78) is augmented by additional term(s) describing, for example, incoming 
radiation and/or coupling to environment, say due to the electromagnetic energy loss in the cavity walls - see Eq. 
(7.68). 
($\kappa \to 0$), the spectrum of the total energy E of the system, which in this limit is the sum of two
independent contributions from the atomic (“spin”) and resonant-cavity subsystems,

$$E|_{\kappa=0} = \pm\frac{\hbar\Omega}{2} + \hbar\omega n \equiv E_n \pm \frac{\hbar\xi}{2}, \qquad (9.79)$$

consists 42 of close level pairs (Fig. 5) centered at values

$$E_n = \hbar\omega\left(n - \frac{1}{2}\right), \qquad \text{with } n = 1, 2, \ldots \qquad (9.80)$$


(At the exact resonance $\omega = \Omega$, i.e. at $\xi = 0$, each pair merges into one double-degenerate level $E_n$.)


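[Fig. 9.5 diagram: energy levels of the “spin-½” subsystem ($\pm\hbar\Omega/2$), of the cavity (equidistant levels
separated by $\hbar\omega$), and of the total system, with close level pairs $E_n \pm \hbar\xi/2$ centered at $E_1 = \hbar\omega/2$,
$E_2 = 3\hbar\omega/2$, ..., above the ground-state level $E = -\hbar\Omega/2$.]

Fig. 9.5. Energy spectrum
of the Jaynes-Cummings
Hamiltonian at $\kappa \ll |\xi|$.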

Since at $\kappa \to 0$ the two subsystems do not interact, the eigenstates corresponding to the sublevels
of the nth pair may be represented by products of their independent ket-vectors:

$$|+\rangle \equiv |{\uparrow}\rangle\otimes|n-1\rangle \qquad \text{and} \qquad |-\rangle \equiv |{\downarrow}\rangle\otimes|n\rangle. \qquad (9.81)$$

As we know from Chapter 6, weak interaction leads to strong hybridization of quantum states with close
energies (in this case, the two states (81) within each pair with the same n) and their negligible mixing with
other states. Hence, at $0 < \kappa \ll \omega \approx \Omega$, a good approximation of an eigenstate with $E \approx E_n$ is given by a
linear superposition of states (81):

$$|\alpha_n\rangle = c_+|{\uparrow}\rangle\otimes|n-1\rangle + c_-|{\downarrow}\rangle\otimes|n\rangle, \qquad (9.82)$$

with certain c-number coefficients $c_\pm$. This relation describes the entanglement of atomic eigenstates ↑
and ↓ with Fock states n - 1 and n of the field mode.


Let me leave the (straightforward) calculation of coefficients $c_\pm$ and eigenenergies of the two
entangled state pairs for the reader’s exercise. This calculation shows, in particular, that at the exact
resonance ($\omega = \Omega$), $|c_+| = |c_-| = 1/\sqrt2$ for both states of each pair. This fact may be interpreted as a
(coherent!) equal sharing of an energy quantum $\hbar\omega = \hbar\Omega$ by the atom and the cavity.
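
For readers who want to check their solution of this exercise, here is a minimal Python sketch of the diagonalization within one level pair. It is an illustration under stated assumptions, not the author's solution: it restricts the Jaynes-Cummings Hamiltonian to the two states of the nth pair, uses units with the Planck constant equal to 1, and takes the off-diagonal matrix element as the coupling constant times the square root of n, which follows from the ladder-operator normalization adopted above (matrix elements equal to 2) together with the standard action of the annihilation operator on Fock states. The function name jc_pair and its signature are hypothetical.

```python
import math

def jc_pair(n, omega, Omega, kappa):
    """Eigenenergies and coefficients (c_plus, c_minus) for the nth level pair,
    in the basis {|up, n-1>, |down, n>}, in units with hbar = 1.

    The 2x2 block, counted from the pair center E_n = omega (n - 1/2), is
        [[ xi/2,  g   ],
         [ g,    -xi/2]],   with xi = Omega - omega and g = kappa sqrt(n).
    """
    xi = Omega - omega
    e_n = omega * (n - 0.5)
    g = kappa * math.sqrt(n)              # assumed off-diagonal element
    half_split = math.hypot(xi / 2.0, g)  # half of the level splitting
    theta = 0.5 * math.atan2(2.0 * g, xi) # mixing angle: tan(2 theta) = 2g/xi
    upper = (e_n + half_split, (math.cos(theta), math.sin(theta)))
    lower = (e_n - half_split, (-math.sin(theta), math.cos(theta)))
    return lower, upper

if __name__ == "__main__":
    lower, upper = jc_pair(n=1, omega=1.0, Omega=1.0, kappa=0.1)
    print("lower:", lower)
    print("upper:", upper)
```

At exact resonance the mixing angle equals 45 degrees, so that both coefficients have the magnitude quoted in the text, while at vanishing coupling the function returns the unperturbed product states and the energies of the uncoupled pair.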

A by-product of the calculation of $c_\pm$ is the fact that the dynamics of state $\alpha_n$ described by Eq.
(82) is similar to that of the generic two-level system that was repeatedly discussed in this course - first

42 Besides the non-degenerate ground state level $E_g = -\hbar\Omega/2$.




time in Sec. 2.6 and then in Chapters 4-6. In particular, if the composite system had been initially
prepared to be in one component state, for example $|{\uparrow}\rangle\otimes|0\rangle$ (i.e. the atom excited, while the cavity in its
ground state) and allowed to evolve on its own, after some time interval it may be found in the
counterpart state $|{\downarrow}\rangle\otimes|1\rangle$, including the first excited Fock state n = 1 of the field mode. This is one more
(resonant) version of the same method for generation of Fock states of electromagnetic field which was
discussed in Sec. 3. 43

Unfortunately, my time devoted to cavity QED is over, and for further reading I have to refer the
reader to special literature. 44


9.5. The Klein-Gordon and relativistic Schrodinger equations 

Now let us discuss the basics of relativistic quantum mechanics of particles with a nonvanishing
rest mass m - i.e., in terms of Eq. (1), the intermediate range of energies: $E \sim mc^2$, i.e. $p \sim mc$.
Historically, the first attempt 45 to extend the nonrelativistic wave mechanics into the relativistic energy
range was based on performing the same transitions from classical observables to their quantum-
mechanical operators as in the nonrelativistic limit:

$$\mathbf p \to \hat{\mathbf p} = -i\hbar\nabla, \qquad E \to \hat H = i\hbar\frac{\partial}{\partial t}. \qquad (9.83)$$

Substitution of these operators, acting on the Schrödinger-picture wavefunction $\Psi(\mathbf r, t)$, into the classical
relation between the energy E and momentum p (of a free particle) leads to the following equations:

Table 9.1. Deriving the Klein-Gordon equation for a free relativistic particle. 46

                      Nonrelativistic limit                                                      Relativistic case
Classical mechanics   $E = \frac{1}{2m}p^2$                                                       $E^2 = c^2p^2 + (mc^2)^2$
Wave mechanics        $i\hbar\frac{\partial}{\partial t}\Psi = \frac{1}{2m}(-i\hbar\nabla)^2\Psi$   $\left(i\hbar\frac{\partial}{\partial t}\right)^2\Psi = c^2(-i\hbar\nabla)^2\Psi + (mc^2)^2\Psi$

43 Another important corollary of the level structure shown in Fig. 5 is the Purcell effect already mentioned in
Sec. 3. As we already know from Chapter 7, if the system is coupled to environment, the coupling suppresses its
quantum coherence, in our case the coherence between components of each pair (82). As a result, if the atom is
initially in state ↑ with higher energy (79), it may perform an incoherent (dissipative) transition to the lower-energy
state ↓, giving energy $\hbar\omega$ to the cavity (n - 1 → n), which rapidly drains it into the environment. Since the total
energies of these initial and final states are close (Fig. 5), the rate of such transitions may be much higher than in
free space. The quantitative analysis of such enhancement is left for the reader’s exercise.

44 I can recommend, for example, either C. Gerry and P. Knight, Introductory Quantum Optics, Cambridge U.
Press, 2005, or G. Agarwal, Quantum Optics, Cambridge U. Press, 2012.

45 This approach was suggested almost simultaneously in 1926-1927 by (at least) V. Fock, E. Schrödinger, O.
Klein and W. Gordon, J. Kudar, T. de Donder and F.-H. van der Dungen, and L. de Broglie.

46 Note that in the sense of Eq. (1), in the nonrelativistic column of this table, the energy is referred to the rest
energy $mc^2$, while in the relativistic column, to zero.




The resulting equation for the nonrelativistic limit is just the usual Schrödinger equation (1.28)
for a free particle. Its relativistic generalization, usually rewritten as

$$\left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2\right)\Psi + \mu^2\Psi = 0, \qquad \text{with } \mu \equiv \frac{mc}{\hbar}, \tag{9.84}$$

is called the Klein-Gordon (or sometimes “Klein-Gordon-Fock”) equation. The most fundamental
solutions of this equation are the same plane, monochromatic waves

$$\Psi(\mathbf{r},t) \propto \exp\{i(\mathbf{k}\cdot\mathbf{r} - \omega t)\} \tag{9.85}$$

as in the nonrelativistic case. Indeed, such waves are eigenstates of operators (83), with eigenvalues

$$\mathbf{p} = \hbar\mathbf{k}, \qquad E = \hbar\omega, \tag{9.86}$$

so that their substitution into Eq. (84) immediately returns us to Eq. (1) with replacements (86):

$$E_\pm = \hbar\omega_\pm = \pm\left[(\hbar c k)^2 + (mc^2)^2\right]^{1/2}. \tag{9.87}$$


Though one may say that this dispersion relation is just a simple combination of the classical
relation (1) and the same basic quantum-mechanical relations (86) as in the nonrelativistic limit, it draws
our attention to the fact that the energy ħω as a function of the momentum ħk has two branches rather than one,
with E−(p) = −E+(p) - see Fig. 6a.
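The two branches of Eq. (87) are easy to tabulate numerically. The following is only a minimal sketch: the function name `kg_branches`, the choice of the electron mass, and the sample wave numbers are illustrative assumptions, not part of the text above.

```python
import numpy as np

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 2.99792458e8         # free-space speed of light, m/s
M_E = 9.1093837015e-31   # electron mass, kg (illustrative choice of particle)


def kg_branches(k, m=M_E):
    """Return (E_plus, E_minus) of Eq. (9.87), in joules, for wave number(s) k in 1/m."""
    e = np.sqrt((HBAR * C * np.asarray(k, dtype=float)) ** 2 + (m * C**2) ** 2)
    return e, -e


if __name__ == "__main__":
    mu = M_E * C / HBAR          # the parameter mu of Eq. (9.84)
    rest = M_E * C**2            # rest energy m c^2
    for k in (0.0, 0.1 * mu, mu, 10.0 * mu):
        e_plus, e_minus = kg_branches(k)
        # The gap between the branches never drops below 2 m c^2.
        print(f"k/mu = {k / mu:5.1f}   E+/mc^2 = {e_plus / rest:8.4f}   "
              f"(E+ - E-)/mc^2 = {(e_plus - e_minus) / rest:8.4f}")
```

At k = 0 the gap between the branches equals 2mc², which is the minimum energy that has to be supplied to lift the system from the lower branch to the upper one.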




Fig. 9.6. (a) Free-particle 

dispersion relation resulting from 
the Klein-Gordon and Dirac 
equations, and (b) creation of a 
particle-antiparticle pair from the 
vacuum. 


Historically, this fact has played a very important role in spurring the fundamental idea of
particle-antiparticle pairs. In this idea (very similar to the concept of electrons and holes in
semiconductors, which was discussed in Sec. 2.8), what we call the vacuum actually corresponds to all
states of the lower branch, with energies E−(p) < 0, being filled, while the states on the upper branch,
with energies E+(p) > 0, are empty. Then an externally supplied energy

$$\Delta E = E_+ - E_- \equiv E_+ + (-E_-) \geq 2mc^2 > 0 \tag{9.88}$$

may bring the system from the lower branch to the upper one (Fig. 6b). The resulting excited state is
interpreted as a combination of a particle (formally, of infinite spatial extension) with energy E+ and
momentum p, and a “hole” (antiparticle) of positive energy (−E−) and momentum −p. This idea 47 has led
to a search for, and discovery of, the positron: the electron’s antiparticle with charge q = +e, in 1932, and
later of the antiproton and other antiparticles.

47 Due to the same P. A. M. Dirac!

Free particles of a finite spatial extension may be described, in this approach, just as in the
nonrelativistic Schrödinger equation, by wave packets: linear superpositions of de Broglie waves (85)
with close wave vectors k, and ω given by Eq. (87), with the positive sign for the “usual” particles, and
the negative sign for antiparticles - see Fig. 6a above. Note that in order to form, from a particle’s wave
packet, a similar wave packet for the antiparticle, with the same phase and group velocities (2.33) in
each direction, we need to change the sign not only before ω, but also before k, i.e. to replace all
component wavefunctions (85), and hence the full wavefunction, with their complex conjugates.


Of more formal properties of the equation, it is easy to prove that its solutions satisfy the same
continuity equation (1.52) with the probability current density j still given by Eq. (1.47), but a different
expression for the probability density w - which becomes very similar to that for j:

$$w = \frac{i\hbar}{2mc^2}\left(\Psi^*\frac{\partial\Psi}{\partial t} - \text{c.c.}\right), \qquad \mathbf{j} = \frac{i\hbar}{2m}\left(\Psi\nabla\Psi^* - \text{c.c.}\right). \tag{9.89}$$

(In the nonrelativistic limit v/c → 0, Eq. (84) allows a reduction of the first relation to Eq. (1.22): w → ΨΨ*.)
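As a quick illustration of the first of Eqs. (89), here is a short worked check for a single plane wave (85). It is only a sketch, using nothing beyond Eqs. (85)-(87) and (89):

```latex
% Plane wave (9.85): \partial\Psi/\partial t = -i\omega\Psi, so the first of Eqs. (9.89) gives
w = \frac{i\hbar}{2mc^2}
    \left(\Psi^*\frac{\partial\Psi}{\partial t} - \Psi\frac{\partial\Psi^*}{\partial t}\right)
  = \frac{i\hbar}{2mc^2}\,(-2i\omega)\,|\Psi|^2
  = \frac{\hbar\omega}{mc^2}\,|\Psi|^2
  = \frac{E}{mc^2}\,|\Psi|^2 .
% Upper branch at low momentum: E_+ \to mc^2, hence w \to |\Psi|^2, as in Eq. (1.22).
% Lower branch: E_- < 0, hence w < 0 for such a wave.
```

So for a wave of the upper branch, w differs from |Ψ|² only by the factor E+/mc², which tends to 1 at v/c → 0.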

The Klein-Gordon equation may be readily generalized to describe a single particle moving in
external fields; for example, the electromagnetic field effects on a particle with charge q may be
described by the same replacement as in the nonrelativistic limit (see Sec. 3.1): 48

$$\hat{\mathbf{p}} \to \hat{\mathbf{P}} - q\mathbf{A}(\mathbf{r},t), \qquad \hat{H} \to \hat{H} - q\phi(\mathbf{r},t), \tag{9.90}$$

where P = −iħ∇ is the canonical momentum operator (3.25), and the vector and scalar potentials, A and
φ, should be treated appropriately - either as c-number functions if the electromagnetic field
quantization is unimportant, or as operators (see Secs. 1-4 above) if it is.

However, the practical value of the relativistic Schrödinger equation is rather limited, for
two main reasons. First of all, it does not give the correct description of particles with spin. For example,
for the hydrogen-like atom, i.e. the motion of an electron with electric charge −e in the Coulomb central
field (3.182) of an immobile nucleus with charge +Ze, the equation may be readily solved exactly 49 and
yields the following spectrum of (doubly-degenerate) energy levels:

$$E = mc^2\left(1 + \frac{Z^2\alpha^2}{\lambda^2}\right)^{-1/2}, \qquad \text{with } \lambda \equiv n + \left[\left(l + \frac{1}{2}\right)^2 - Z^2\alpha^2\right]^{1/2} - \left(l + \frac{1}{2}\right), \tag{9.91}$$

where n = 1, 2, ... and l = 0, 1, ..., n − 1 are the same quantum numbers as in the nonrelativistic theory
(see Sec. 3.6), and α ≡ e²/4πε₀ħc ≈ 1/137 is the fine structure constant - see Eq. (6.62). The three
leading terms of the Taylor expansion of this result in the small parameter Zα are as follows:

$$E \approx mc^2\left[1 - \frac{Z^2\alpha^2}{2n^2} - \frac{Z^4\alpha^4}{2n^4}\left(\frac{n}{l + 1/2} - \frac{3}{4}\right)\right]. \tag{9.92}$$


48 After such generalization, Eq. (84) is usually called the relativistic Schrodinger equation. 

49 This task is left for the reader.

The first of these terms is just the rest energy of the particle. The second term,

$$E_n = -mc^2\frac{Z^2\alpha^2}{2n^2} = -\frac{mZ^2e^4}{(4\pi\varepsilon_0)^2\hbar^2}\frac{1}{2n^2} \equiv -\frac{E_0}{2n^2}, \qquad \text{with } E_0 \equiv Z^2E_\mathrm{H}, \tag{9.93}$$

reproduces the nonrelativistic Bohr formula (3.191). Finally, the third term,

$$-mc^2\frac{Z^4\alpha^4}{2n^4}\left(\frac{n}{l + 1/2} - \frac{3}{4}\right) \equiv -\frac{2E_n^2}{mc^2}\left(\frac{n}{l + 1/2} - \frac{3}{4}\right), \tag{9.94}$$

is just the kinetic-relativistic contribution (6.52) to the fine structure of the Bohr levels (93). However,
as we already know from Sec. 6.3, for a spin-½ particle such as the electron, the spin-orbit interaction
(6.56) gives an additional contribution of the same order to the fine structure, so that the net result,
confirmed by experiment, is given by Eq. (6.60), i.e. is different from Eq. (94). This is very natural,
because the relativistic Schrödinger equation does not have the very notion of spin.
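The quality of the expansion (92) can be checked against the exact formula (91) with a few lines of code. This is a minimal sketch in units of mc²; the function names and the numerical value used for α are assumptions made for illustration.

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant (approximate numerical value)


def kg_energy(n, l, Z=1, alpha=ALPHA):
    """Exact level of Eq. (9.91), in units of m c^2.

    Valid only while (l + 1/2)^2 > (Z alpha)^2, i.e. for not-too-heavy nuclei.
    """
    za2 = (Z * alpha) ** 2
    lam = n + math.sqrt((l + 0.5) ** 2 - za2) - (l + 0.5)
    return (1 + za2 / lam**2) ** -0.5


def kg_energy_expanded(n, l, Z=1, alpha=ALPHA):
    """Three leading terms of the Taylor expansion, Eq. (9.92), in units of m c^2."""
    za2 = (Z * alpha) ** 2
    return 1 - za2 / (2 * n**2) - za2**2 / (2 * n**4) * (n / (l + 0.5) - 0.75)


if __name__ == "__main__":
    for n, l in [(1, 0), (2, 0), (2, 1), (3, 2)]:
        exact = kg_energy(n, l)
        approx = kg_energy_expanded(n, l)
        print(f"n={n} l={l}  E/mc^2 - 1 = {exact - 1:+.12e}  "
              f"expansion error = {exact - approx:+.2e}")
```

For Z = 1 the difference between the two functions is of the order of (Zα)⁶, i.e. far below the third term of the expansion, and levels with the same n but larger l lie slightly higher.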

Second, even for massive spinless particles (such as the Higgs boson), for which this equation is
believed to be valid, the most important problems are related to particle interactions at high energies of
the order of ΔE ~ 2mc² (88) and beyond. Due to the possibility of creation and annihilation of
particle-antiparticle pairs at such energies, the number of particles participating in such interactions is typically
considerable (and variable), and an adequate description of the system is given not by the relativistic
Schrödinger equation (which is formulated in single-particle terms), but by the quantum field theory - to
which I will devote just a few sentences at the very end of this chapter.


9.6. Dirac’s theory

The real breakthrough toward the quantum relativistic theory of electrons (and any spin-½
fermions) was achieved in 1928 by P. A. M. Dirac. For that time, the structure of his theory was highly
nontrivial. Namely, while formally preserving, in the coordinate representation, the same
Schrödinger-picture equation of quantum dynamics as in the nonrelativistic quantum mechanics, 50

$$i\hbar\frac{\partial\Psi}{\partial t} = \hat{H}\Psi, \tag{9.95}$$

it postulates that the wavefunction Ψ is not a scalar complex function of time and coordinates, but a
four-component column-vector (sometimes called the bispinor) of such functions, its Hermitian-conjugate
bispinor Ψ† being a 4-component row-vector of their complex conjugates:

$$\Psi = \begin{pmatrix}\Psi_1(\mathbf{r},t)\\ \Psi_2(\mathbf{r},t)\\ \Psi_3(\mathbf{r},t)\\ \Psi_4(\mathbf{r},t)\end{pmatrix}, \qquad \Psi^\dagger = \left(\Psi_1^*(\mathbf{r},t),\ \Psi_2^*(\mathbf{r},t),\ \Psi_3^*(\mathbf{r},t),\ \Psi_4^*(\mathbf{r},t)\right), \tag{9.96}$$


50 After the “naturally-relativistic” form of the Klein-Gordon equation (84), this apparent return to the
nonrelativistic Schrödinger equation may look very counter-intuitive. However, it becomes a bit less surprising
taking into account the fact (whose proof is left for the reader) that Eq. (84) may be also recast into form (95) for
a two-component column-vector (spinor) Ψ, with a Hamiltonian which may be represented by a 2×2 matrix - and
hence expressed via the Pauli matrices (4.

and that the Hamiltonian participating in Eq. (95) is a 4×4 matrix in the Hilbert space of bispinors Ψ.
For a free particle, the postulated Hamiltonian looks amazingly simple: 51

$$\hat{H} = c\,\hat{\boldsymbol{\alpha}}\cdot\hat{\mathbf{p}} + \hat{\beta}\,mc^2, \tag{9.97}$$


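where p = −iħ∇ is the same 3D vector of momentum component operators as in the nonrelativistic case,
while operators α and β may be presented in the following shorthand 2×2 form:

$$\hat{\boldsymbol{\alpha}} \equiv \begin{pmatrix}\hat{0} & \hat{\boldsymbol{\sigma}}\\ \hat{\boldsymbol{\sigma}} & \hat{0}\end{pmatrix}, \qquad \hat{\beta} \equiv \begin{pmatrix}\hat{I} & \hat{0}\\ \hat{0} & -\hat{I}\end{pmatrix}. \tag{9.98a}$$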


Operator α, composed of the Pauli vector operators σ, is also a vector in the usual 3D space, so
that each of its 3 Cartesian components is a 4×4 matrix. The particular form of the 2×2 matrices
corresponding to operators σ and I in Eq. (98a) depends on the basis selected for representation of the
spin states of the particle; for example, in the standard z-basis, in which the Cartesian components σx,
σy, and σz of σ are represented by the Pauli matrices (4.105), the full matrix form of Eq. (98a) is

$$\alpha_x = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&1&0&0\\1&0&0&0\end{pmatrix},\quad \alpha_y = \begin{pmatrix}0&0&0&-i\\0&0&i&0\\0&-i&0&0\\i&0&0&0\end{pmatrix},\quad \alpha_z = \begin{pmatrix}0&0&1&0\\0&0&0&-1\\1&0&0&0\\0&-1&0&0\end{pmatrix},\quad \beta = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}. \tag{9.98b}$$


(According to the second of Eqs. (98a), β has this form in any spin basis.) It is straightforward to use Eqs.
(98) to verify that matrices αx, αy, αz and β satisfy the following relations:

$$\alpha_x^2 = \alpha_y^2 = \alpha_z^2 = \beta^2 = I, \tag{9.99}$$

$$\alpha_x\alpha_y + \alpha_y\alpha_x = \alpha_y\alpha_z + \alpha_z\alpha_y = \alpha_z\alpha_x + \alpha_x\alpha_z = \alpha_x\beta + \beta\alpha_x = \alpha_y\beta + \beta\alpha_y = \alpha_z\beta + \beta\alpha_z = 0, \tag{9.100}$$

i.e. anticommute.
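Relations (99)-(100) may also be verified by brute force. The following minimal sketch builds the 4×4 matrices of Eq. (98b) from the Pauli matrices in the standard z-basis and checks all the relations numerically; the names `alpha`, `ALPHAS`, `BETA` and `check_dirac_algebra` are illustrative assumptions.

```python
import numpy as np

# Pauli matrices in the standard z-basis
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)


def alpha(sigma):
    """4x4 matrix of one Cartesian component of alpha, per Eq. (9.98a)."""
    return np.block([[Z2, sigma], [sigma, Z2]])


ALPHAS = [alpha(s) for s in (SX, SY, SZ)]   # alpha_x, alpha_y, alpha_z
BETA = np.block([[I2, Z2], [Z2, -I2]])      # beta, per Eq. (9.98a)


def check_dirac_algebra():
    """Return True if Eqs. (9.99) and (9.100) hold for all four matrices."""
    mats = ALPHAS + [BETA]
    eye4 = np.eye(4)
    for i, a in enumerate(mats):
        if not np.allclose(a @ a, eye4):            # Eq. (9.99): squares equal unity
            return False
        for b in mats[i + 1:]:
            if not np.allclose(a @ b + b @ a, 0):   # Eq. (9.100): pairs anticommute
                return False
    return True


if __name__ == "__main__":
    print("Eqs. (9.99)-(9.100) satisfied:", check_dirac_algebra())
```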


Acting essentially as in Sec. 4.1, but using commutation relations (99)-(100), it is 
straightforward to show that any solution to the Dirac equation obeys the probability conservation law, 
i.e. the continuity equation (1.52), with the probability density, 


51 Moreover, if the time derivative participating in Eq. (95) and the three coordinate derivatives participating (via
the momentum operator) in Eq. (97) are merged into one 4-vector operator ∂/∂x_k ≡ {∇, ∂/∂(ict)}, the Dirac
equation (95) may be rewritten in an even simpler, manifestly Lorentz-invariant 4-vector form (with the implicit
summation over the repeated index k = 1, ..., 4 - see, e.g., EM Sec. 9.4):

$$\left(\hat{\gamma}_k\frac{\partial}{\partial x_k} + \mu\right)\Psi = 0, \qquad \text{where } \hat{\boldsymbol{\gamma}} \equiv \{\hat{\gamma}_1, \hat{\gamma}_2, \hat{\gamma}_3\} = \begin{pmatrix}\hat{0} & -i\hat{\boldsymbol{\sigma}}\\ i\hat{\boldsymbol{\sigma}} & \hat{0}\end{pmatrix}, \quad \hat{\gamma}_4 = \hat{\beta},$$

where μ ≡ mc/ħ - just as in Eq. (84). Note also that, very counter-intuitively, the Dirac Hamiltonian (97) is linear
in momentum, while the nonrelativistic Hamiltonian of a particle, as well as the relativistic Schrödinger equation,
are quadratic in p. In my humble opinion, the Dirac theory (including the concept of antiparticles) may compete
for the title of the most revolutionary theoretical idea in physics, despite such heavy contenders as Newton’s
laws, the Maxwell equations, Einstein’s relativity, the Bohr atom, and the Gibbs statistical distributions.


$$w = \Psi^\dagger\Psi, \tag{9.101}$$

and the probability current,

$$\mathbf{j} = c\,\Psi^\dagger\hat{\boldsymbol{\alpha}}\Psi, \tag{9.102}$$


looking almost as in the nonrelativistic theory - cf. Eqs. (1.22) and (1.47). Note, however, the Hermitian
conjugation used in these formulas instead of the complex conjugation, in order to form scalars w, jx, jy,
and jz from 4-component vectors (96).

This qualified similarity is extended to the fundamental, plane-wave solutions of the Dirac
equation in free space. Indeed, plugging such a solution, in the form

$$\Psi = u\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \equiv \begin{pmatrix}u_1\\ u_2\\ u_3\\ u_4\end{pmatrix}e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{9.103}$$


into Eqs. (95) and (97), we get a system of 4 coupled, linear algebraic equations for 4 complex c-number
amplitudes u1, u2, u3, u4. The condition of their consistency yields the same dispersion relation (87), i.e. the
same two-branch diagram shown in Fig. 6, as follows from the Klein-Gordon equation. The difference is
that plugging each value of ω, given by Eq. (87), back into the system of equations for amplitudes u, we
get two solutions for vector u for each of the energy branches. In the standard spin z-basis they may be
presented as:

$$\text{for } E = E_+ > 0:\quad u_{+\uparrow} = c_\uparrow\begin{pmatrix}1\\ 0\\ \dfrac{cp_z}{E_+ + mc^2}\\ \dfrac{c(p_x + ip_y)}{E_+ + mc^2}\end{pmatrix},\qquad u_{+\downarrow} = c_\downarrow\begin{pmatrix}0\\ 1\\ \dfrac{c(p_x - ip_y)}{E_+ + mc^2}\\ \dfrac{-cp_z}{E_+ + mc^2}\end{pmatrix}; \tag{9.104a}$$

$$\text{for } E = E_- < 0:\quad u_{-\uparrow} = c_\uparrow\begin{pmatrix}\dfrac{cp_z}{E_- - mc^2}\\ \dfrac{c(p_x + ip_y)}{E_- - mc^2}\\ 1\\ 0\end{pmatrix},\qquad u_{-\downarrow} = c_\downarrow\begin{pmatrix}\dfrac{c(p_x - ip_y)}{E_- - mc^2}\\ \dfrac{-cp_z}{E_- - mc^2}\\ 0\\ 1\end{pmatrix}, \tag{9.104b}$$

where c↑ and c↓ are normalization coefficients.
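The four columns (104) can be checked directly: each should be an eigenvector of the 4×4 matrix of the free-particle Hamiltonian (97), with the eigenvalue E+ or E− given by Eq. (87). The sketch below does this for an arbitrary c-number momentum; the units with m = c = 1 (the defaults), the function names, and the dictionary labels are assumptions made for illustration, and the normalization coefficients are simply set to 1.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def dirac_hamiltonian(p, m=1.0, c=1.0):
    """4x4 matrix of H = c alpha.p + beta m c^2, Eq. (9.97), for a c-number momentum p."""
    sp = p[0] * SX + p[1] * SY + p[2] * SZ          # sigma . p
    return np.block([[m * c**2 * I2, c * sp], [c * sp, -m * c**2 * I2]])


def plane_wave_spinors(p, m=1.0, c=1.0):
    """Return (E_plus, E_minus, dict of the four columns of Eqs. (9.104)), unnormalized."""
    px, py, pz = p
    e_plus = np.sqrt(c**2 * (px**2 + py**2 + pz**2) + (m * c**2) ** 2)
    e_minus = -e_plus
    dp = e_plus + m * c**2       # denominator in Eq. (9.104a)
    dm = e_minus - m * c**2      # denominator in Eq. (9.104b)
    u = {
        "+up":   np.array([1, 0, c * pz / dp, c * (px + 1j * py) / dp]),
        "+down": np.array([0, 1, c * (px - 1j * py) / dp, -c * pz / dp]),
        "-up":   np.array([c * pz / dm, c * (px + 1j * py) / dm, 1, 0]),
        "-down": np.array([c * (px - 1j * py) / dm, -c * pz / dm, 0, 1]),
    }
    return e_plus, e_minus, u


if __name__ == "__main__":
    p = (0.3, -0.4, 1.2)                      # arbitrary momentum, in units of m c
    e_plus, e_minus, u = plane_wave_spinors(p)
    h = dirac_hamiltonian(p)
    for label, vec in u.items():
        energy = e_plus if label.startswith("+") else e_minus
        print(label, "is an eigenvector:", np.allclose(h @ vec, energy * vec))
```

At p → 0 the small components vanish, and the columns reduce to the unit vectors of the nonrelativistic limit.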


The simplest interpretation of these solutions is that Eq. (103) with vectors u+, given by Eq.
(104a), represents a spin-½ particle (say, an electron), while that with vectors u−, given by Eq. (104b),
represents an antiparticle (a positron), and two solutions for each particle correspond to two opposite
directions of spin, σz = ±1, Sz = ±ħ/2. This interpretation is indeed solid in the nonrelativistic limit, when
the two last components of vector (104a) and the two first components of vector (104b) are negligibly small:

$$u_{+\uparrow} \to \begin{pmatrix}1\\0\\0\\0\end{pmatrix},\quad u_{+\downarrow} \to \begin{pmatrix}0\\1\\0\\0\end{pmatrix},\quad u_{-\uparrow} \to \begin{pmatrix}0\\0\\1\\0\end{pmatrix},\quad u_{-\downarrow} \to \begin{pmatrix}0\\0\\0\\1\end{pmatrix},\qquad \text{at } \frac{p}{mc} \to 0. \tag{9.105}$$


In order to show this, let us use the Dirac equation to calculate the Heisenberg-picture law of
time evolution of operators of the Cartesian components of the orbital angular momentum L ≡ r×p, for
example of Lx = ypz − zpy, taking into account that operators (98a) commute with those of r and p, and
also the Heisenberg commutation relations (2.14):

$$i\hbar\frac{d\hat{L}_x}{dt} = \left[\hat{L}_x, \hat{H}\right] = c\,\hat{\boldsymbol{\alpha}}\cdot\left[(\hat{y}\hat{p}_z - \hat{z}\hat{p}_y),\ \hat{\mathbf{p}}\right] = -i\hbar c\left(\hat{\alpha}_z\hat{p}_y - \hat{\alpha}_y\hat{p}_z\right), \tag{9.106}$$


with similar relations for the two other Cartesian components of the operator. Since the right-hand side of
these equations is different from zero, the orbital momentum is generally not conserved - even for a free
particle! Let us, however, consider the following vector operator,

$$\hat{\mathbf{S}} \equiv \frac{\hbar}{2}\begin{pmatrix}\hat{\boldsymbol{\sigma}} & \hat{0}\\ \hat{0} & \hat{\boldsymbol{\sigma}}\end{pmatrix}, \tag{9.107a}$$

whose Cartesian components, in the z-basis, are represented by 4×4 matrices

$$S_x = \frac{\hbar}{2}\begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix},\quad S_y = \frac{\hbar}{2}\begin{pmatrix}0&-i&0&0\\i&0&0&0\\0&0&0&-i\\0&0&i&0\end{pmatrix},\quad S_z = \frac{\hbar}{2}\begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&1&0\\0&0&0&-1\end{pmatrix}, \tag{9.107b}$$


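and calculate the Heisenberg-picture law of time evolution of these components, for example

$$i\hbar\frac{d\hat{S}_x}{dt} = \left[\hat{S}_x, \hat{H}\right] = c\left[\hat{S}_x,\ \left(\hat{\alpha}_x\hat{p}_x + \hat{\alpha}_y\hat{p}_y + \hat{\alpha}_z\hat{p}_z\right)\right]. \tag{9.108}$$

A direct calculation of the commutators of matrices (98) and (107) yields

$$\left[\hat{S}_x, \hat{\alpha}_x\right] = 0, \qquad \left[\hat{S}_x, \hat{\alpha}_y\right] = i\hbar\hat{\alpha}_z, \qquad \left[\hat{S}_x, \hat{\alpha}_z\right] = -i\hbar\hat{\alpha}_y, \tag{9.109}$$

so that we finally get

$$i\hbar\frac{d\hat{S}_x}{dt} = i\hbar c\left(\hat{\alpha}_z\hat{p}_y - \hat{\alpha}_y\hat{p}_z\right), \tag{9.110}$$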


with similar expressions for the other two components of the operator. Comparing this result with Eq. (106),
we see that any Cartesian component of operator (5.198),

$$\hat{\mathbf{J}} \equiv \hat{\mathbf{L}} + \hat{\mathbf{S}}, \tag{9.111}$$

is an integral of motion, 52 so that this operator may be interpreted as the one representing the total angular
momentum. Hence, operator (107) may be interpreted as the spin operator of a spin-½ particle (e.g., an
electron). As follows from the last of Eqs. (107b), columns (105) represent the eigenkets of the
z-component of that operator, with eigenvalues Sz = ±ħ/2, depending on the arrow index. So, the Dirac
theory provides a justification for spin-½ - or, somewhat more humbly, replaces the spin hypothesis by
an assumption of a simpler (and hence more plausible), Lorentz-invariant Hamiltonian (97).
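The matrix part of this argument, i.e. Eqs. (109) and (110), can be checked numerically. The sketch below uses units with ħ = 1 and treats the momentum as a c-number vector, so that it only verifies the spin-matrix algebra, not the orbital commutator (106); all names are illustrative assumptions.

```python
import numpy as np

HBAR = 1.0  # assumption: units with hbar = 1

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
Z2 = np.zeros((2, 2), dtype=complex)

ALPHA_MATS = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]              # Eq. (9.98)
SPIN = [0.5 * HBAR * np.block([[s, Z2], [Z2, s]]) for s in SIGMA]       # Eq. (9.107)


def comm(a, b):
    """Commutator [a, b] of two square matrices."""
    return a @ b - b @ a


def spin_x_commutator(p, c=1.0):
    """[S_x, c alpha.p] for a c-number momentum p - the matrix content of Eq. (9.108)."""
    h_kin = c * sum(pk * ak for pk, ak in zip(p, ALPHA_MATS))
    return comm(SPIN[0], h_kin)


if __name__ == "__main__":
    p = (0.2, -0.7, 1.1)
    rhs = 1j * HBAR * (ALPHA_MATS[2] * p[1] - ALPHA_MATS[1] * p[2])     # Eq. (9.110), c = 1
    print("Eq. (9.110) reproduced:", np.allclose(spin_x_commutator(p), rhs))
    print("S_z eigenvalues on columns (9.105):", np.real(np.diag(SPIN[2])))
```

The right-hand side of Eq. (110) is exactly opposite to that of Eq. (106), which is why the sum (111) is conserved.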

Note, however, that this simple interpretation is not valid for the exact solutions (103)-(104), so that generally the
eigenstates of the Dirac Hamiltonian are certain linear (coherent) superpositions of component
wavefunctions describing the particle and its antiparticle - each with both directions of spin. This fact
leads to several interesting effects, including the so-called Klein paradox at reflection of a particle from
a tunnel barrier. 53 It is curious that some of these effects may be reproduced in such nonrelativistic
systems as electrons moving in a 2D honeycomb lattice (e.g., in graphene), since they also feature a
(locally) linear dispersion relation - see Eq. (3.122). 54


9.7. Low-energy limit

The generalization of Dirac’s theory to the case of a particle with electric charge q, moving
in a classically-described electromagnetic field, may be obtained using the same replacement (90). As a result,
Eq. (95) becomes

$$\left[c\,\hat{\boldsymbol{\alpha}}\cdot(-i\hbar\nabla - q\mathbf{A}) + mc^2\hat{\beta} + (q\phi - \hat{H})\right]\Psi = 0, \tag{9.112}$$


where the Hamiltonian operator H is understood in the sense of Eq. (95), i.e. as the partial time
derivative with multiplier iħ. Let us prepare this equation for a low-energy approximation by acting on
its left-hand side by a similar square bracket (also an operator!), but with the opposite sign before the
last parentheses. Using relations (99) and (100), and the fact that the space- and time-independent operators
α and β commute with the spin-independent functions A(r,t) and φ(r,t), as well as with the
Hamiltonian operator iħ∂/∂t, the result is

$$\left\{c^2\left[\hat{\boldsymbol{\alpha}}\cdot(-i\hbar\nabla - q\mathbf{A})\right]^2 + (mc^2)^2 + c\left[\hat{\boldsymbol{\alpha}}\cdot(-i\hbar\nabla - q\mathbf{A}),\ (q\phi - \hat{H})\right] - (q\phi - \hat{H})^2\right\}\Psi = 0. \tag{9.113}$$

A direct calculation of the first square bracket, using Eqs. (98) and (107), yields

$$\left[\hat{\boldsymbol{\alpha}}\cdot(-i\hbar\nabla - q\mathbf{A})\right]^2 = (-i\hbar\nabla - q\mathbf{A})^2 - 2q\,\hat{\mathbf{S}}\cdot\nabla\times\mathbf{A}. \tag{9.114}$$


But according to the last of Eqs. (3.21), the last vector product in the right-hand side is just the magnetic
field

$$\mathscr{B} = \nabla\times\mathbf{A}. \tag{9.115}$$

Similarly, we may use the first of Eqs. (3.21), for the electric field,

$$\mathscr{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}, \tag{9.116}$$


52 It is straightforward to show that this result remains valid for a particle in the field of central potential U{ r). 

53 See, e.g., A. Calogeracos and N. Dombey, Contemp. Phys. 40 , 313 (1999). 

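54 For a review see, e.g., T. Robinson, Am. J. Phys. 80, 141 (2012).

to simplify the commutator participating in Eq. (113):

$$\left[\hat{\boldsymbol{\alpha}}\cdot(-i\hbar\nabla - q\mathbf{A}),\ (q\phi - \hat{H})\right] \equiv -i\hbar q\,\hat{\boldsymbol{\alpha}}\cdot\left[\nabla, \phi\right] - i\hbar q\,\hat{\boldsymbol{\alpha}}\cdot\frac{\partial\mathbf{A}}{\partial t} = i\hbar q\,\hat{\boldsymbol{\alpha}}\cdot\left(-\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}\right) = i\hbar q\,\hat{\boldsymbol{\alpha}}\cdot\mathscr{E}. \tag{9.117}$$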


As a result, Eq. (113) becomes

$$\left\{c^2(-i\hbar\nabla - q\mathbf{A})^2 - (q\phi - \hat{H})^2 + (mc^2)^2 - 2qc^2\,\hat{\mathbf{S}}\cdot\mathscr{B} + i\hbar cq\,\hat{\boldsymbol{\alpha}}\cdot\mathscr{E}\right\}\Psi = 0. \tag{9.118}$$

So far, this is an exact result, equivalent to Eq. (112), but more convenient for an analysis of the
low-energy limit, in which not only the offset energy E − mc² (which is the energy used in nonrelativistic
quantum mechanics), but also the electrostatic energy of the particle, |qφ|, is much smaller than the
rest energy mc². In this limit, the second and third terms of Eq. (118) almost cancel, and introducing the
offset Hamiltonian

$$\tilde{H} \equiv \hat{H} - mc^2\hat{I}, \tag{9.119}$$

we may approximate their difference, up to the first nonvanishing term, as

$$(q\phi\hat{I} - \hat{H})^2 - (mc^2)^2\hat{I} \equiv (q\phi\hat{I} - mc^2\hat{I} - \tilde{H})^2 - (mc^2)^2\hat{I} \approx 2mc^2\left(\tilde{H} - q\phi\hat{I}\right). \tag{9.120}$$
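The step behind Eq. (120) is short enough to be written out explicitly. The following worked line is a sketch; the auxiliary operator A is a notation introduced only here.

```latex
% Define \hat{A} \equiv \tilde{H} - q\phi\hat{I}; since mc^2\hat{I} commutes with any operator,
(q\phi\hat{I} - \hat{H})^2 - (mc^2)^2\hat{I}
  = (mc^2\hat{I} + \hat{A})^2 - (mc^2)^2\hat{I}
  = 2mc^2\hat{A} + \hat{A}^2
  \approx 2mc^2\left(\tilde{H} - q\phi\hat{I}\right).
% The dropped term \hat{A}^2 is smaller than the retained one by a factor of the order of
% |\tilde{E} - q\phi| / 2mc^2 \ll 1, where \tilde{E} \equiv E - mc^2 is the offset energy.
```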

As a result, after division of all terms by 2mc², Eq. (118) may be approximated as

$$\tilde{H}\Psi = \left[\frac{1}{2m}(-i\hbar\nabla - q\mathbf{A})^2 + q\phi - \frac{q}{m}\hat{\mathbf{S}}\cdot\mathscr{B} + \frac{i\hbar q}{2mc}\hat{\boldsymbol{\alpha}}\cdot\mathscr{E}\right]\Psi. \tag{9.121}$$

Let us discuss this important result. The first two terms in the square brackets give the
Hamiltonian (3.26) that was extensively used in Chapter 3 for the discussion of nonrelativistic motion of
charged particles. Note again that the contribution of the vector potential A to that Hamiltonian is
essentially relativistic, in the following sense: when used for the description of magnetic interaction of
two charged particles, due to their orbital motion with speed v << c, the magnetic interaction is a factor
of (v/c)² smaller than the electrostatic interaction of the particles. 55 The reason why we did discuss the
effects of A in Chapter 3 was that it was used there to describe external magnetic fields, keeping our
analysis valid even for the cases when that field is strong by being produced by relativistic effects - such
as aligned spins in a permanent magnet.

The next, third term in the square brackets is also familiar to the reader: it was introduced
informally in Sec. 4.1, and then formally in Sec. 4.4 to describe the effect of the magnetic field on the particle’s
spin - see Eqs. (4.3), (4.5), and (4.163). When justifying this form of interaction, I referred mostly to
results of Stern-Gerlach-type experiments, but it is extremely pleasing that this result 56 follows from
such a fundamental relativistic treatment as Dirac's theory. As we already know from the discussion of
the Zeeman effect in Sec. 6.4, the effects of the magnetic field on the orbital motion of an electron
(described by orbital angular momentum L) and its spin S are of the same order, i.e. present an
essentially relativistic effect.

55 This difference may be traced even by classical means - see, e.g., EM Sec. 5.1.

56 With the g-factor still equal to exactly 2 - see Eq. (4.116) and its discussion. In order to describe the small
deviation of g_e from 2, the electromagnetic field should be quantized (just as this was done in Secs. 1-4), and its
potentials A and φ, participating in Eq. (112), should be treated as operators - rather than as c-number functions as
was assumed above. The calculation of this deviation is one of the basic problems of quantum field theory. Other
small but important effects of electromagnetic interactions, described by the theory, include the so-called Lamb
shift of atomic levels - see the end of this chapter for references.


Finally, the last term in the square brackets of Eq. (121) is also not quite new for us: in particular,
it describes the spin-orbit interaction. Indeed, in the case of a classical, spherically-symmetric electric field
ℰ with potential φ(r) = U(r)/q, the term may be reduced to Eq. (6.56b):

$$\hat{H}_\mathrm{so} = \frac{1}{2m^2c^2}\,\frac{1}{r}\frac{dU}{dr}\,\hat{\mathbf{S}}\cdot\hat{\mathbf{L}} \equiv -\frac{q}{2m^2c^2}\,\frac{\mathscr{E}}{r}\,\hat{\mathbf{S}}\cdot\hat{\mathbf{L}}. \tag{9.122}$$

The proof of this correspondence requires a bit of additional work, 57 because in Eq. (121), the term
responsible for the spin-orbit interaction acts on 4-component wavefunctions, while Hamiltonian (122)
is supposed to act on nonrelativistic wavefunctions with account of spin, whose coordinate
representation is given by 2-component columns - spinors: 58

$$\Psi = \begin{pmatrix}\psi_\uparrow(\mathbf{r})\\ \psi_\downarrow(\mathbf{r})\end{pmatrix}. \tag{9.123}$$


The simplest way to prove the identity of the two formulas is not to use Eq. (121) directly, but to
return to the Dirac equation (112), for the particular case of motion in a stationary electric field with no
magnetic field, when Dirac’s Hamiltonian is reduced to

$$\hat{H} = c\,\hat{\boldsymbol{\alpha}}\cdot\hat{\mathbf{p}} + \hat{\beta}mc^2 + U(\mathbf{r}). \tag{9.124}$$

Since this Hamiltonian is time-independent, we may look for its 4-component eigenfunctions in the form

$$\Psi(\mathbf{r},t) = \begin{pmatrix}\psi_+(\mathbf{r})\\ \psi_-(\mathbf{r})\end{pmatrix}\exp\left\{-i\frac{E}{\hbar}t\right\}, \tag{9.125}$$

where each of ψ± is a 2-component column of the type (123), representing two spin states of the particle
(index +) and antiparticle (index −). Plugging Eq. (125) into the Dirac equation (95) with Hamiltonian (124),
and using Eq. (98a), we get the following system of two linear equations:

$$\left[E - mc^2 - U(\mathbf{r})\right]\psi_+ - c\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\psi_- = 0, \qquad \left[E + mc^2 - U(\mathbf{r})\right]\psi_- - c\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\psi_+ = 0. \tag{9.126}$$


57 The only facts immediately evident from Eq. (121) are that the term we are discussing is proportional to the
electric field, as required by Eq. (122), and that it is of the proper order of magnitude. Indeed, Eqs. (101)-(102)
imply that in the Dirac theory, cα plays the role of the velocity operator, so that the expectation values of the term
are of the order of ħqvℰ/2mc². Since the expectation values of the operators participating in Hamiltonian (122)
scale as S ~ ħ/2 and L ~ mvr, the spin-orbit interaction energy has the same order of magnitude.

58 As a reminder, in this course the notion of spinor was introduced earlier for two-particle states - see Eq. (8.14).
For a single particle, that definition is reduced to ψ(r)|s⟩, whose representation in a particular spin-½ basis is a
column similar to Eq. (123). Also note that spinors (123) may be expanded into a series over the spin-orbitals
(8.117) discussed in Sec. 8.3, with index j used for numbering both the two directions of spin (i.e. two
components of the spinor's column) and orbital eigenfunctions.

Expressing ψ− from the latter equation, and plugging the result into the former one, we get the following
single equation for the particle’s spinor:

$$\left[E - mc^2 - U(\mathbf{r}) - c^2\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\frac{1}{E + mc^2 - U(\mathbf{r})}\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\right]\psi_+ = 0. \tag{9.127}$$


So far, this is an exact equation for eigenstates and eigenvalues of Hamiltonian (124). It may be
substantially simplified in the low-energy limit when both the potential energy 59 and the nonrelativistic
eigenenergy

$$\tilde{E} \equiv E - mc^2 \tag{9.128}$$

are much less than mc². Indeed, in this case the expression in the denominator of the last term in the
brackets of Eq. (127) is close to 2mc². Since (σ·p)² = p², with that replacement Eq. (127) is reduced to the
nonrelativistic Schrödinger equation, similar for both spin components of ψ+, and hence giving
spin-degenerate energy levels. In order to recover small relativistic and spin-orbit effects, we need a slightly
more accurate approximation:

$$\frac{1}{E + mc^2 - U(\mathbf{r})} \equiv \frac{1}{2mc^2 + \tilde{E} - U(\mathbf{r})} \equiv \frac{1}{2mc^2}\left[1 + \frac{\tilde{E} - U(\mathbf{r})}{2mc^2}\right]^{-1} \approx \frac{1}{2mc^2}\left[1 - \frac{\tilde{E} - U(\mathbf{r})}{2mc^2}\right], \tag{9.129}$$

in which Eq. (127) is reduced to

$$\left[\tilde{E} - U(\mathbf{r}) - \frac{\hat{p}^2}{2m} + \hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\frac{\tilde{E} - U(\mathbf{r})}{(2mc)^2}\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\right]\psi_+ = 0. \tag{9.130}$$


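As follows from Eqs. (5.46)-(5.47), the operators of momentum and of a function of coordinates
commute as

$$\left[\hat{\mathbf{p}}, U(\mathbf{r})\right] = -i\hbar\nabla U, \tag{9.131}$$

so that the last term in the square brackets of Eq. (130) may be rewritten as

$$\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\frac{\tilde{E} - U(\mathbf{r})}{(2mc)^2}\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}} \equiv \frac{\tilde{E} - U(\mathbf{r})}{(2mc)^2}\,\hat{p}^2 + \frac{i\hbar}{(2mc)^2}\left(\hat{\boldsymbol{\sigma}}\cdot\nabla U\right)\left(\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\right). \tag{9.132}$$

Since in the low-energy limit both terms in the right-hand side of this relation are much smaller
than the three leading terms of Eq. (130), in the first of them we may replace the numerator with its
nonrelativistic value p²/2m. With this replacement, the term coincides with the first relativistic correction
to the kinetic energy operator - see Eqs. (6.47) and (6.49a). The second term, proportional to the electric
field ℰ = −∇φ = −∇U/q, may be transformed further on, using a readily verifiable relation

$$\left(\hat{\boldsymbol{\sigma}}\cdot\nabla U\right)\left(\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\right) = (\nabla U)\cdot\hat{\mathbf{p}} + i\hat{\boldsymbol{\sigma}}\cdot\left[(\nabla U)\times\hat{\mathbf{p}}\right]. \tag{9.133}$$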
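Relation (133) is a particular case of the general Pauli-matrix identity (σ·a)(σ·b) = a·b + iσ·(a×b). The sketch below checks that identity numerically for ordinary (c-number) real vectors a and b; in Eq. (133) the second vector is the momentum operator, so that this is only a check of the matrix algebra, with the operator order kept the same on both sides. The function names are illustrative assumptions.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def sigma_dot(v):
    """2x2 matrix of the scalar product sigma . v for a c-number 3-vector v."""
    return v[0] * SX + v[1] * SY + v[2] * SZ


def pauli_identity_residual(a, b):
    """Largest element of |(sigma.a)(sigma.b) - [a.b + i sigma.(a x b)]|."""
    lhs = sigma_dot(a) @ sigma_dot(b)
    rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
    return float(np.max(np.abs(lhs - rhs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.normal(size=3), rng.normal(size=3)   # stand-ins for grad U and p
    print("residual of the identity:", pauli_identity_residual(a, b))
```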


Of the two terms in the right-hand side, only the second one depends on spin, 60 giving the following
spin-orbit interaction contribution to the Hamiltonian,

$$\hat{H}_\mathrm{so} = \frac{\hbar}{(2mc)^2}\,\hat{\boldsymbol{\sigma}}\cdot\left[(\nabla U)\times\hat{\mathbf{p}}\right] \equiv \frac{q}{2m^2c^2}\,\hat{\mathbf{S}}\cdot\left[(\nabla\phi)\times\hat{\mathbf{p}}\right]. \tag{9.134}$$

For a central electric field with φ(r) = φ(r), the potential gradient has only one, radial component: ∇φ =
(dφ/dr) r/r = −ℰ r/r, and with the angular momentum definition L ≡ r×p, Eq. (134) is reduced to Eq.
(122).

59 Strictly speaking, this requirement is imposed on the expectation values of U(r) in the eigenstates to be found.


As was shown in Sec. 6.3, the perturbative treatment of Eq. (122), together with the kinetic-relativistic
correction (6.49), in the hydrogen-like atom problem, leads to the fine structure of each Bohr
level E_n, given by Eq. (6.60):

$$\Delta E_\mathrm{fine} = \frac{E_n^2}{2mc^2}\left(3 - \frac{4n}{j + 1/2}\right). \tag{9.135}$$

This result gets a confirmation from the surprising fact that for the hydrogen-like atom problem, the
Dirac equation may be solved exactly - without any assumptions. I do not have time/space to reproduce
the solution, 61 and will list just the final result for the energy spectrum:

$$E = mc^2\left\{1 + \frac{Z^2\alpha^2}{\left[n + \left\{(j + 1/2)^2 - Z^2\alpha^2\right\}^{1/2} - (j + 1/2)\right]^2}\right\}^{-1/2}. \tag{9.136}$$

Here n = 1, 2, ... is the same main quantum number as in Bohr’s theory, while j is the quantum number
specifying eigenvalues (5.203) of the total angular momentum’s square J² in the units of ħ², taking
half-integer values: j = l ± ½ = 1/2, 3/2, 5/2, ... - see Eq. (5.215). Such a set of quantum numbers is rather
natural, because due to the spin-orbit interaction, the orbital and spin angular momenta are not
conserved, while their vector sum, J = L + S, is - in the absence of an external magnetic field. Each energy
level (136) is doubly-degenerate, with two eigenstates representing two directions of spin - i.e. two
values of l = j ± ½ at fixed j.


Since, according to Eq. (1.9), the square of the fine-structure constant α ≡ e²/4πε₀ħc may be
presented as the ratio E_H/mc², the low-energy limit (E − mc² ~ E_H << mc²) may be pursued by expanding
Eq. (136) into the Taylor series in (Zα)² << 1. The result,

$$E \approx mc^2\left[1 - \frac{Z^2\alpha^2}{2n^2} - \frac{Z^4\alpha^4}{2n^4}\left(\frac{n}{j + 1/2} - \frac{3}{4}\right)\right], \tag{9.137}$$


has the same structure, and allows the same interpretation as Eq. (92), but with the last term coinciding
with Eq. (6.60) - and with experimental results. Historically, this correct description of the fine structure
of atomic levels provided a decisive proof of Dirac’s theory.
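To get a feeling for the numbers, the exact spectrum (136) and its expansion (137) may be compared directly, and the fine-structure splitting of the n = 2 level of hydrogen evaluated. This is a minimal sketch: the function names and the numerical values used for α and for the electron rest energy are assumptions made for illustration.

```python
import math

ALPHA = 1 / 137.035999   # fine-structure constant (approximate numerical value)
MC2_EV = 510998.95       # electron rest energy m c^2 in eV (approximate numerical value)


def dirac_energy(n, j, Z=1, alpha=ALPHA):
    """Exact level of Eq. (9.136), in units of m c^2."""
    za2 = (Z * alpha) ** 2
    jp = j + 0.5
    lam = n + math.sqrt(jp**2 - za2) - jp
    return (1 + za2 / lam**2) ** -0.5


def dirac_energy_expanded(n, j, Z=1, alpha=ALPHA):
    """Three leading terms of the Taylor expansion, Eq. (9.137), in units of m c^2."""
    za2 = (Z * alpha) ** 2
    return 1 - za2 / (2 * n**2) - za2**2 / (2 * n**4) * (n / (j + 0.5) - 0.75)


def fine_splitting_ev(n=2, j_low=0.5, j_high=1.5, Z=1):
    """Energy difference between two levels of the same n and different j, in eV."""
    return (dirac_energy(n, j_high, Z) - dirac_energy(n, j_low, Z)) * MC2_EV


if __name__ == "__main__":
    for n, j in [(1, 0.5), (2, 0.5), (2, 1.5)]:
        err = dirac_energy(n, j) - dirac_energy_expanded(n, j)
        print(f"n={n} j={j}:  (E - mc^2) = {(dirac_energy(n, j) - 1) * MC2_EV:.6f} eV,"
              f"  expansion error = {err:+.1e} mc^2")
    print(f"n = 2 splitting between j = 3/2 and j = 1/2: {fine_splitting_ev():.3e} eV")
```

According to Eq. (137), the n = 2 splitting is close to mc²α⁴/32, i.e. of the order of a few times 10⁻⁵ eV for hydrogen.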

However, even such an impressive theory does not have too many direct applications. The main
reason for that was already discussed in brief at the end of Sec. 5: due to the possibility of creation and
annihilation of particle-antiparticle pairs at energies higher than 2mc², the number of particles
participating in high-energy interactions is not fixed. An adequate general description of such a situation
is given by the quantum field theory, in which the particle wavefunction is treated as a field to be
quantized, using so-called field operators Ψ̂(r,t) - very much as the electromagnetic field was treated in
Secs. 1-4 above. (The Dirac equation follows from the quantum field theory in the single-particle
approximation.)

60 The first term gives a small, spin-independent shift of the energy spectrum, which is very difficult to verify
experimentally.

61 Good descriptions of the solution are available in many textbooks (the older the better :-), for example see Sec.
53 in L. Schiff, Quantum Mechanics, 3rd ed., McGraw-Hill (1968).

As was mentioned above on several occasions, the quantum field theory is beyond the scope of
this course, and I have to stop here, referring the interested reader to one of several excellent available
textbooks on this discipline. 62 (I would strongly encourage the student going in this direction to start
with playing with the field operators on his or her own, taking clues from Eqs. (16), but replacing the
creation/annihilation operators â† and â of the harmonic oscillator with those of the general second
quantization formalism outlined in Sec. 8.3.)


9.8. Exercise problems 

9.1. Prove the Casimir formula (23) for the attraction force $F = -PA$ between two perfectly 
conducting parallel plates of area A, separated by a narrow vacuum gap $d \ll A^{1/2}$. 

Hint: You may like to use the Euler-Maclaurin formula. 63 

9.2. Radiation of some single-mode quantum sources may have such a high degree of coherence 
that it is possible to observe interference from two independent sources with virtually the same 
frequency, incident on one detector. 

(i) Generalize Eq. (29) to this case. 

(ii) Use the generalized expression to show that incident waves in different Fock states do not 
create an interference pattern. 

9.3. Calculate the zero-delay value $g^{(2)}(0)$ of the second-order correlation function of a single-mode 
electromagnetic field in the so-called Schrödinger-cat state: a coherent superposition of two 
Glauber states, with equal amplitudes, equal but sign-opposite parameters $\alpha$, and a certain phase shift 
between them. 

9.4. Calculate the zero-delay value $g^{(2)}(0)$ of the second-order correlation function of a single-mode 
electromagnetic field in the squeezed ground state $\zeta$ defined by Eq. (5.172). 

9.5. Calculate the rate of spontaneous photon emission (into unrestricted free space) by a 
hydrogen atom, initially in the 2p state (n = 2, l = 1) with m = 0. Would the result be different for m = ±1? 
For the 2s state (n = 2, l = 0, m = 0)? Discuss the relation between these quantum-mechanical results 
and those given by the classical theory of radiation, using the simplest classical model of the atom. 


62 For a gradual introduction see, e.g., either L. Brown, Quantum Field Theory, Cambridge U. Press (1994) or R. 
Klauber, Student Friendly Quantum Field Theory, Sandtrove (2013); on the other hand, M. Srednicki, Quantum 
Field Theory, Cambridge U. Press (2007) and A. Zee, Quantum Field Theory in a Nutshell, 2nd ed., Princeton 
(2010), among many others, offer a steeper learning curve. 

63 See, e.g., MA Eq. (2.12). 

9.6. An electron has been placed at the lowest excited level of a spherically-symmetric, quadratic 
potential well $U(r) = m_e\omega^2 r^2/2$. Calculate the rate of its relaxation to the ground state, with the emission of a 
photon (into free space). Compare the rate with that for a similar transition of the hydrogen atom, for 
the case when the radiation frequencies of these two systems are equal. 

9.7. Derive an analog of Eq. (53) for the spontaneous photon emission into free space, due to 
a change of the magnetic dipole moment m of a small-size system. 


9.8. A spin-1/2 particle, with the gyromagnetic ratio $\gamma$, is in its orbital ground state in a dc 
magnetic field $\mathcal{B}_0$. Calculate the rate of its spontaneous transition from the higher to the lower energy 
level, with the emission of a photon into free space. Evaluate the rate for an electron in a field of 
10 T, and discuss the implications of this result for experiments with electron spins. 

9.9. Calculate the rate of spontaneous transitions between the two sublevels of the ground state 
of a hydrogen atom, formed as a result of its hyperfine splitting. Discuss the implications of the result 
for the width of the 21-cm spectral line. 

9.10. Find the eigenstates and eigenvalues of the Jaynes-Cummings Hamiltonian (78), and 
discuss their behavior near the resonance point $\omega = \Omega$. 

9.11. Analyze the Purcell effect, mentioned in Secs. 3 and 4, quantitatively; in particular, calculate 
the so-called Purcell factor $F_P$, defined as the ratio of the spontaneous emission rates $\Gamma_s$ of an atom in a 
resonant cavity (tuned exactly to the quantum transition frequency) and in free space. 


9.12. Prove that the Klein-Gordon equation (9.84) may be rewritten in a form similar to the 
nonrelativistic Schrödinger equation, 

$$i\hbar \frac{\partial \psi}{\partial t} = \hat{H}\psi,$$

for a two-component wavefunction $\psi$, 64 with the Hamiltonian represented (in the usual z-basis) by the 
following 2×2 matrix: 

$$\mathrm{H} = -(\sigma_z + i\sigma_y) \frac{\hbar^2}{2m} \nabla^2 + mc^2 \sigma_z.$$

Use your solution to discuss the physical meaning of the wavefunction’s components. 


9.13. Calculate and discuss the energy spectrum of a relativistic, spinless, charged particle placed 
into an external uniform, time-independent magnetic field $\mathcal{B}$. Use the result to formulate the condition of 
validity of the nonrelativistic theory. 

Hint: Reduce the relativistic Schrödinger equation, describing the problem, to the nonrelativistic 
one describing the same problem, with some effective parameter(s). 


64 Here $\psi$ is a function of both r and t, and the lower-case letter is used only to distinguish this two-component 
spinor from the scalar function $\Psi(\mathbf{r}, t)$ obeying the Klein-Gordon equation. 

9.14. Prove Eq. (91) for the energy spectrum of a hydrogen-like atom, calculated from the 
relativistic Schrödinger equation. 

Hint: Use the fact that, as a mathematical analysis of Eq. (3.184) shows, its eigenvalues are given 
by Eq. (3.191), $\varepsilon_n = -1/2n^2$, with $n = l + 1 + n_r$, where $n_r = 0, 1, 2, \ldots$, even if l is not an integer. 65 

9.15. Derive the general expression for the differential cross-section of the elastic scattering of a 
spinless relativistic particle by a static potential $U(\mathbf{r})$, in the Born approximation, and formulate the 
conditions of its validity. Use these results to calculate the differential cross-section of scattering of a 
particle with electric charge -e by the Coulomb electrostatic potential $\phi(\mathbf{r}) = Ze/4\pi\varepsilon_0 r$. 

9.16 . Calculate the commutator of operator l and the Dirac Hamiltonian of a free particle. 
Compare the result with that for the nonrelativistic Hamiltonian of a free particle, and interpret the 
difference. 

9.17.* In the Heisenberg picture of quantum dynamics, derive an equation describing the time 
evolution of a free electron’s velocity in the Dirac theory. Solve the equation for the simplest state, with 
constant energy and momentum, and discuss the solution. 

9.18.* Calculate the eigenstates and eigenenergies of a spin-1/2 particle with charge q, placed into 
a uniform, time-independent external magnetic field $\mathcal{B}$. Compare the calculated energy spectrum with 
those following from the non-relativistic theory and the relativistic Schrödinger equation. 

9.19.* Following the recommendation at the end of Chapter 9 of the lecture notes, introduce the 
quantum field operators $\hat{\psi}$, which would be related to the usual wavefunctions $\psi$ just as the EM field 
operators (9.16) are related to the classical electromagnetic fields, and explore the basic properties of 
these operators. (For this preliminary study, consider just the fixed-time situation.) 


65 Actually, the key relation (3.192), $n \ge l + 1$, mathematically stems from the fact that the “genuine” quantum 
number of the radial problem, $n_r$, can only take non-negative integer values. 

Chapter 10. Making Sense of Quantum Mechanics 

This (very cryptic) chapter addresses the issues of quantum mechanics interpretation that are still a 
subject of debate - fortunately not affecting practical applications of the quantum theory. 


10.1. Hidden variables and local reality 

Only now, with a quantitative understanding of the principles of quantum mechanics, are we 
ready to proceed to the discussion of its interpretation 1 - an issue that is very closely related to the 
problems of measurements, already discussed in Sec. 7.7. As was already mentioned in that section, the 
founding fathers of quantum mechanics did not leave much guidance on these topics, because in the first 
years after the advent of this exciting new theory they gave understandable preference to using it for 
deriving new particular results, and then were much distracted by the development of nuclear physics 
and its applications. This is why, after a very important but inconclusive discussion between A. Einstein 
and N. Bohr in the mid-1930s, the debates on quantum measurements and the related conceptual issues of 
quantum mechanics resumed only in the 1950s. They led to a key contribution by J. Bell in 
the early 1960s, and to important experimental work on verifying Bell’s inequalities (see below), but 
besides that work, the recent progress has been marginal, and the opinions of even prominent physicists on certain 
issues still differ widely. 

Perhaps the central controversial issue is question (iii) posed in Sec. 7.7: what (if any :-) is the 
“real” state of a quantum-mechanical system before a nearly-perfect measurement giving a certain 
outcome? In order to be specific, let us focus again on the simplest example of Stern-Gerlach 
measurements of spin-1/2 particles - because of their physical transparency and technical simplicity. 2 As 
the reader knows very well by now, even in a pure quantum spin state (for example, ↑), i.e. the least 
uncertain state of the system, the results of the Stern-Gerlach measurements of other spin components 
are still uncertain. Indeed, as we know from Sec. 4.4, the ket-vector of this state may be presented as 

$$|\uparrow\rangle = \frac{1}{\sqrt{2}} \left( |\rightarrow\rangle + |\leftarrow\rangle \right), \qquad (10.1)$$

so that the probabilities of measuring either of the values $S_x = +\hbar/2$ and $S_x = -\hbar/2$ equal 50%. So, did the spin 
have a certain value of $S_x$ a split second before the Stern-Gerlach measurement that gave a certain 
outcome, for example $S_x = +\hbar/2$? For a classical system, with perfect detectors, the answer is definitely 
yes. In this case, the pre-measurement probability of 50% just reflects the degree of our ignorance about 
the real state of the system, and the detector merely reveals it. 

However, the situation in quantum mechanics is different, and such an interpretation is impossible, 
as was clearly shown in the famous EPR paper published in 1935 by A. Einstein, B. Podolsky, and N. 
Rosen. Its original version discussed thought experiments with a pair of 1D particles prepared in a quantum 
state in which both the sum of their momenta and the difference of their coordinates are exactly fixed: $p_1 + p_2 = 0$, 
$x_1 - x_2 = a$. 3 However, usually the discussion is recast into an equivalent Stern-Gerlach experiment 
shown in Fig. 1a. 4 A source emits rare pairs of spin-1/2 particles, propagating in opposite directions, with 
exactly zero net spin, but otherwise in random spin states. After the spatial separation of the particles 
has become sufficiently large (see below), the spin state of each of them is measured with a Stern-Gerlach 
detector, one of them (Fig. 1, detector SG1) somewhat closer to the particle source, so that it makes 
the measurement first, at time $t_1 < t_2$. 


1 I believe that another popular name for this group of issues, “foundations of quantum mechanics”, is hardly 
appropriate. The only reliable foundation of physics (or any other genuine scientific discipline) is a set of 
experimental facts. 

2 As was discussed in Sec. 7.7, Stern-Gerlach-type experiments may be readily made almost “perfect”, i.e. 
virtually unaffected by instrument imperfections, provided that we do not care about the state of the particle after 
a single-shot measurement. 



Fig. 10.1. (a) General scheme of two-particle Stern-Gerlach experiments, and (b) the orientation of the detectors, assumed at the derivation of Bell’s inequality (14). 


First, let both detectors be oriented along the same direction, say the z-axis. Evidently, the 
probability of each detector to give either of the values $S_z = \pm\hbar/2$ is 50%. However, if the first detector has 
given the result $S_z = -\hbar/2$, then even before the second detector’s measurement, we know that it will give the result 
$S_z = +\hbar/2$ with 100% probability. So far, the result allows for a classical interpretation, just as for the 
single-particle measurements discussed in Secs. 2.5 and 7.7. Thus we may fancy that the second particle really 
has a definite spin before the measurement, and the first measurement has just removed our ignorance 
about that reality. In other words, the change of probability is due to the statistical ensemble 
redefinition: the 50% probability belongs to the ensemble of all experiments, while the 100% 
probability, to the sub-ensemble of experiments with the $S_z = -\hbar/2$ outcome of the first experiment. 

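However, let the source generate the particle pairs in the entangled, singlet state (8.19), 

$$|s_{12}\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle \right), \qquad (10.2)$$

that certainly satisfies the above assumptions: the probability of each $S_z$ value of any particle is 50%, the 
sum of both $S_z$ is exactly zero, and if the first detector’s result is $S_z = -\hbar/2$, then the state of the remaining 
particle is ↑, with zero uncertainty. Now let us use Eq. (1), and its counterpart for vector $|\downarrow\rangle$, 5 to present 
the same initial state (2) in the form 

$$|s_{12}\rangle = \frac{1}{\sqrt{2}} \left[ \frac{1}{\sqrt{2}}\left(|\rightarrow\rangle + |\leftarrow\rangle\right) \frac{1}{\sqrt{2}}\left(|\rightarrow\rangle - |\leftarrow\rangle\right) - \frac{1}{\sqrt{2}}\left(|\rightarrow\rangle - |\leftarrow\rangle\right) \frac{1}{\sqrt{2}}\left(|\rightarrow\rangle + |\leftarrow\rangle\right) \right]. \qquad (10.3)$$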


3 This is possible, because the corresponding operators commute: $[\hat{p}_1 + \hat{p}_2, \hat{x}_1 - \hat{x}_2] = [\hat{p}_1, \hat{x}_1] - [\hat{p}_2, \hat{x}_2] = 0$. 

4 Another convenient experimental technique of entangled state generation, frequently used in this field, is the 
four-wave mixing (FWM) of optical photons. Its brief discussion may be found, for example, in CM Sec. 5.5. 

5 As a reminder, it differs from Eq. (1) only by the sign in the parentheses - see, e.g., Eqs. (4.123). 

Opening the parentheses (without swapping the ket-vector order!), we get an expression similar to Eq. 
(2), but now for the x-basis: 

$$|s_{12}\rangle = \frac{1}{\sqrt{2}} \left( |\leftarrow\rightarrow\rangle - |\rightarrow\leftarrow\rangle \right). \qquad (10.4)$$


Hence if we use the first detector (closest to the particle source) to measure $S_x$ rather than $S_z$, then after it 
has given a certain result (say, $S_x = -\hbar/2$), we know for sure, before the second particle spin’s 
measurement, that its $S_x$ component equals $+\hbar/2$. 
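
The equivalence of the z-basis and x-basis forms of the singlet state can be checked by brute force. The sketch below is a minimal pure-Python check, assuming the usual convention |→⟩ = (|↑⟩ + |↓⟩)/√2, |←⟩ = (|↑⟩ - |↓⟩)/√2, which is consistent with Eq. (1); the helper names `kron`, `combine`, and `project_first` are made up for this illustration. It builds (|↑↓⟩ - |↓↑⟩)/√2 and (|←→⟩ - |→←⟩)/√2 as four-component vectors, and then projects the first spin on ← to see what is left for the second one.

```python
import math

S = 1 / math.sqrt(2)

# Single-spin kets as (amplitude of up, amplitude of down) in the z-basis
UP, DOWN = (1.0, 0.0), (0.0, 1.0)
RIGHT = (S, S)     # S_x = +hbar/2 (assumed convention, see lead-in)
LEFT = (S, -S)     # S_x = -hbar/2

def kron(u, v):
    """Product ket |u>|v>, components ordered as up-up, up-down, down-up, down-down."""
    return [ui * vj for ui in u for vj in v]

def combine(c1, k1, c2, k2):
    """Linear combination c1*k1 + c2*k2 of two two-particle kets."""
    return [c1 * x + c2 * y for x, y in zip(k1, k2)]

singlet_z = combine(S, kron(UP, DOWN), -S, kron(DOWN, UP))
singlet_x = combine(S, kron(LEFT, RIGHT), -S, kron(RIGHT, LEFT))

def project_first(bra, state):
    """Unnormalized ket of particle 2 after particle 1 is found in <bra| (real amplitudes)."""
    return [bra[0] * state[0] + bra[1] * state[2],
            bra[0] * state[1] + bra[1] * state[3]]

# Detector 1 finds S_x = -hbar/2: what is the state of particle 2?
rest = project_first(LEFT, singlet_z)
norm = math.hypot(*rest)
probability = norm ** 2
rest = [x / norm for x in rest]

print("same state in both bases:",
      all(abs(x - y) < 1e-12 for x, y in zip(singlet_z, singlet_x)))
print("probability of the first outcome:", probability)
print("second particle ket (z-basis components):", rest)
```

The normalized ket left for the second particle coincides with `RIGHT`, i.e. with the definite value $S_x = +\hbar/2$, while the first outcome itself has the probability of 50%.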

So, depending on the experiment performed on the first particle, the second particle turns out to 
be in one of two states - either with a definite component $S_z$ or with a definite component $S_x$, in each 
case without any uncertainty. Evidently, this situation cannot be interpreted in classical terms if the 
particles do not interact during the measurements. A. Einstein was deeply unhappy with such a 
situation, because it did not satisfy a general requirement for any theory, which nowadays is called 
local reality. His definition of this requirement was as follows: 

“The real factual situation of system 2 is independent of what is done with system 1 that is 
spatially separated from the former.” 

(Here the term “separated” is a bit uncertain, but from the context it is clear that Einstein 
meant the detector separation by a superluminal interval, i.e. by a distance 

$$|\mathbf{r}_1 - \mathbf{r}_2| > c\,|t_1 - t_2|, \qquad (10.5)$$

where the measurement time difference, participating in the right-hand part, includes the measurement 
duration.) In Einstein’s view, since quantum mechanics does not satisfy the local reality condition, it 
cannot be considered a complete theory of Nature. 

This situation naturally raises the question whether something (usually called hidden variables) 
may be added to the quantum-mechanical description in order to satisfy the local reality requirement. 
The first definite statement in this regard was J. von Neumann’s “proof” 6 (first famous, then infamous 
:-) that such variables cannot be introduced; for a while his work satisfied quantum mechanics 
practitioners. 7 A major new contribution to the problem was made only in the 1960s by J. Bell. 8 First of 
all, he found an elementary (in his words, “foolish”) error in von Neumann’s logic, which voids his 
“proof”. Second, he demonstrated that Einstein’s local reality condition is incompatible with the 
conclusions of quantum mechanics - which had been, by that time, confirmed by too many experiments to 
be seriously questioned. Since no hidden variable introduction can change this situation, in this sense 
such an introduction is impossible. 


6 In his pioneering book J. von Neumann, Mathematische Grundlagen der Quantenmechanik [Mathematical 
Foundations of Quantum Mechanics], Springer, 1932. (The first English translation was published only in 1955.) 

7 Evidently, it would not satisfy A. Einstein, but reportedly he did not know about von Neumann’s result before 
signing the EPR paper. 

8 See, e.g., J. S. Bell, Rev. Mod. Phys. 38, 447 (1966), or J. S. Bell, Foundations of Physics 12, 158 (1982). 

Let me describe a particular version of Bell’s proof (suggested by E. Wigner), using the same 
EPR pair experiment (Fig. 1a), in which each SG detector may be oriented in any of 3 directions: a, b, or c 
- see Fig. 1b. As we know from Chapter 4, if a fully-polarized beam of spin-1/2 particles is passed 
through a Stern-Gerlach apparatus forming angle $\phi$ with the polarization axis, the probabilities of the two 
counterpart outcomes of the experiment are 

$$W(\phi_+) = \cos^2\frac{\phi}{2}, \qquad W(\phi_-) = \sin^2\frac{\phi}{2}. \qquad (10.6)$$

Let us use this formula to calculate all joint probabilities of measurement outcomes, starting 
from detectors 1 and 2 oriented, respectively, in directions a and c. Since the angle between the negative 
direction of axis a and the positive direction of axis c is $\phi_{a_-,c_+} = \pi - \varphi$ (see the dashed arrow in Fig. 1b), we 
get 

$$W(a_+, c_+) = W(a_+)\,W(c_+ | a_+) = \frac{1}{2}\cos^2\frac{\phi_{a_-,c_+}}{2} = \frac{1}{2}\cos^2\frac{\pi - \varphi}{2} = \frac{1}{2}\sin^2\frac{\varphi}{2}. \qquad (10.7)$$

Absolutely similarly, 

$$W(c_+, b_+) = W(c_+)\,W(b_+ | c_+) = \frac{1}{2}\sin^2\frac{\varphi}{2}, \qquad (10.8)$$

$$W(a_+, b_+) = W(a_+)\,W(b_+ | a_+) = \frac{1}{2}\cos^2\frac{\pi - 2\varphi}{2} = \frac{1}{2}\sin^2\varphi. \qquad (10.9)$$



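Now note that for any angle $\varphi$ smaller than $\pi/2$ (as in the case shown in Fig. 1b), 

$$\frac{1}{2}\sin^2\varphi \ge \frac{1}{2}\sin^2\frac{\varphi}{2} + \frac{1}{2}\sin^2\frac{\varphi}{2} \equiv \sin^2\frac{\varphi}{2}, \qquad (10.10)$$

with the equality reached only at $\varphi = 0$. (For example, for $\varphi \to 0$ the left-hand part of this relation tends to $\varphi^2/2$, while the right-hand part, to 
$\varphi^2/4$.) Hence the quantum-mechanical result gives, in particular, 

$$W(a_+, b_+) \ge W(a_+, c_+) + W(c_+, b_+). \qquad (10.11)$$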
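
The joint probabilities (7)-(9) and the resulting relation (11) can also be obtained directly from the singlet state, without using conditional probabilities. The following sketch assumes that the directions a, c, and b lie in one plane, with c tilted by the angle φ from a, and b by 2φ (which is the geometry implied by Eqs. (7)-(9)); the function names are illustrative only. For each detector, the ket of the "+" outcome along an axis tilted by θ from z is taken as (cos(θ/2), sin(θ/2)) in the z-basis.

```python
import math

def spin_up_along(theta):
    """Ket of spin +hbar/2 along an axis tilted by theta from z, in the x-z plane."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def joint_plus_plus(theta1, theta2):
    """W(+,+) for the singlet state (|ud> - |du>)/sqrt(2), detectors at theta1, theta2."""
    u1, d1 = spin_up_along(theta1)
    u2, d2 = spin_up_along(theta2)
    # Amplitude <theta1+|<theta2+|s12> for real single-spin components
    amplitude = (u1 * d2 - d1 * u2) / math.sqrt(2)
    return amplitude ** 2

def bell_terms(phi):
    """Return W(a+,b+), W(a+,c+), W(c+,b+) for a at 0, c at phi, b at 2*phi."""
    a, c, b = 0.0, phi, 2 * phi
    return joint_plus_plus(a, b), joint_plus_plus(a, c), joint_plus_plus(c, b)

for phi_deg in (15, 30, 45, 60, 75):
    w_ab, w_ac, w_cb = bell_terms(math.radians(phi_deg))
    print(f"phi = {phi_deg:2d} deg: W(a+,b+) = {w_ab:.4f}, "
          f"W(a+,c+) + W(c+,b+) = {w_ac + w_cb:.4f}")
```

For every listed angle, the first number exceeds the second one, in agreement with Eq. (11); for example, at φ = 60° Eqs. (7)-(9) give 3/8 against 1/4.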


On the other hand, we may compose another inequality for the same probabilities without 
calculating them from any particular theory, but using the local reality assumption. Let us list all 
possible outcomes of detector measurements, taking into account the zero net spin: 

Detector 1 results    Detector 2 results    Probability 
a+, b+, c+            a-, b-, c-            W1 
a+, b+, c-            a-, b-, c+            W2 
a+, b-, c+            a-, b+, c-            W3 
a+, b-, c-            a-, b+, c+            W4 
a-, b+, c+            a+, b-, c-            W5 
a-, b+, c-            a+, b-, c+            W6 
a-, b-, c+            a+, b+, c-            W7 
a-, b-, c-            a+, b+, c+            W8 

From the local reality point of view, these measurement options are independent, so we may 
write: 

$$W(a_+, c_+) = W_2 + W_4, \qquad W(c_+, b_+) = W_3 + W_7, \qquad W(a_+, b_+) = W_3 + W_4. \qquad (10.12)$$

On the other hand, since no probability may be negative (by its very definition), we may always write 

$$W_3 + W_4 \le (W_2 + W_4) + (W_3 + W_7). \qquad (10.13)$$

Plugging into this inequality the values of its left-hand part and of these two parentheses, given by Eq. (12), we get 

$$W(a_+, b_+) \le W(a_+, c_+) + W(c_+, b_+). \qquad (10.14)$$

This is (one of several possible forms of) Bell’s inequality, which has to be satisfied by any local-reality 
theory; it directly contradicts the quantum-mechanical result (11). 

Though experimental tests of Bell’s inequalities were started in the late 1960s, the 
interpretation of the first results was vulnerable to two criticisms: 

(i) The detectors were not fast enough and not far enough apart to have relation (5) satisfied. This is 
why, as a matter of principle, there was a chance that information on one measurement had been 
transferred (by some, mostly implausible, means) to particles before the second measurement - the 
so-called locality loophole. 

(ii) Particle detection efficiencies were too low to have sufficiently small error bars for both parts 
of the inequality - the detection loophole. 

Gradually, these loopholes have been closed. 9 As expected, substantial violations of the Bell 
inequalities equivalent to Eq. (14) have been demonstrated, essentially rejecting any possibility to reconcile 
quantum mechanics with Einstein’s local reality requirement. 
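
The counting behind Eqs. (12)-(14) is easy to reproduce numerically. The sketch below is only an illustration of that argument, with made-up helper names: it enumerates the eight local-reality options of the table (each fixing the outcomes of particle 1 along a, b, and c, with particle 2 having the opposite ones), assigns them random non-negative weights W1...W8, and checks that the combination W(a+,c+) + W(c+,b+) - W(a+,b+) never becomes negative.

```python
import itertools
import random

# Each option fixes particle 1's outcomes (+1 or -1) along (a, b, c); zero net
# spin forces particle 2 to have the opposite outcomes. The order of the tuples
# reproduces the rows W1...W8 of the table.
TYPES = list(itertools.product((+1, -1), repeat=3))
AXIS = {"a": 0, "b": 1, "c": 2}

def joint(weights, axis1, axis2):
    """W(axis1+, axis2+): detector 1 finds + along axis1, detector 2 finds + along axis2."""
    return sum(w for w, t in zip(weights, TYPES)
               if t[AXIS[axis1]] == +1 and t[AXIS[axis2]] == -1)

def random_weights(rng):
    """Random normalized set of non-negative probabilities W1...W8."""
    raw = [rng.random() for _ in TYPES]
    total = sum(raw)
    return [x / total for x in raw]

rng = random.Random(0)   # fixed seed, so that the run is reproducible
worst = min(
    joint(w, "a", "c") + joint(w, "c", "b") - joint(w, "a", "b")
    for w in (random_weights(rng) for _ in range(10_000))
)
print(f"smallest W(a+,c+) + W(c+,b+) - W(a+,b+) found: {worst:.4f}")
```

The printed minimum stays non-negative, because the combination is just W2 + W7. In contrast, the quantum-mechanical values (7)-(9) make the same combination negative for any 0 < φ < π/2 (e.g., 1/4 - 3/8 = -1/8 at φ = 60°), so that no choice of the weights W1...W8 can reproduce them.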


10.2. Interpretations of quantum mechanics 

The fact that quantum mechanics is incompatible with local reality makes its reconciliation with 
our (classically-bred) “common sense” rather challenging. Here is a brief list of the major interpretations 
of quantum mechanics that try to provide at least a partial reconciliation of this kind: 

(i) The so-called Copenhagen interpretation, to which most physicists subscribe. This 
“interpretation” does not really interpret anything; it just states the internal randomness of measurement 
results in quantum mechanics, essentially saying: “Do not worry; this is just how it is; live with it”. For 
me personally, this interpretation, at least in its most frequently repeated forms, has only one, rather 
pedagogical weakness: though it implies statistical ensembles (otherwise, how would you define the 
probability?), it does not put sufficient emphasis on their role, in particular on the possible ensemble 
redefinition as the only key point of human involvement in the measurement process. 10 Perhaps the most 
impressive objection to the Copenhagen interpretation was given by A. Einstein during his 1935 
discussion with N. Bohr: “God does not play dice.” OK, when Einstein speaks, we all should listen, but 
perhaps when God speaks (through the experimental results), we have to pay even more attention. 


9 Important milestones on that way were experiments by A. Aspect et al., Phys. Rev. Lett. 49, 91 (1982) and M. 
Rowe et al., Nature 409, 791 (2001). A detailed review of the experimental situation was given, for example, by 
M. Genovese, Phys. Repts. 413, 319 (2005); see also more recent experiments by D. Matsukevich et al., Phys. 
Rev. Lett. 100, 150404 (2008) and D. Salart et al., Nature 454, 861 (2008). Presently, a low-noise demonstration 
of the Bell inequality violation has become a standard test in each experiment with entangled qubits used for 
quantum encryption research - see Sec. 8.5. 

(ii) Non-local reality. After the dismissal of von Neumann’s “proof” by J. Bell, to the best of my 
knowledge, there has been no proof that hidden parameters could not be introduced, provided that they 
do not imply the local reality. Of constructive approaches, perhaps the most notable contribution was 
made by D. Bohm, 11 who developed L. de Broglie’s interpretation of the wavefunction as a “pilot 
wave”, making it quantitative. In the wave mechanics version of this concept, the wavefunction, 
governed by the Schrödinger equation, just guides a real, point-like classical particle whose coordinates 
serve as hidden variables. However, this concept does not satisfy the notion of local reality. Namely, the 
measurement of the particle’s coordinate at a certain point $\mathbf{r}_1$ has to instantly change the wavefunction 
everywhere, including points $\mathbf{r}_2$ in the superluminal interval range (5). So, Bohm’s hidden variables 
would hardly make A. Einstein happy. After having recognized this problem, D. Bohm abandoned his 
theory - in J. Bell’s view, perhaps too early. To my personal taste, however, the assumption of such (in 
Einstein’s words) “spooky action at a distance” is too large a sacrifice to save the classical determinism. 

(iii) The many-worlds interpretation introduced in 1957 by H. Everett and popularized in the 
1960s and 1970s by B. DeWitt. In this interpretation, all possible measurement outcomes do happen, 
splitting the Universe into the corresponding number of “parallel” Universes, so that from one of them, 
other Universes and hence other outcomes cannot be observed. Let me leave to the reader an estimate of 
the rate at which the parallel Universes are being constantly generated (say, per second), taking into account 
that such generation should take place not only at explicit lab experiments, but at any irreversible 
process such as the fission of any atomic nucleus or the absorption of a photon, everywhere in each Universe - 
whether its result is recorded or not. Even the main proponent of this interpretation, B. DeWitt, has 
confessed: “The idea is not easy to reconcile with common sense”. I agree. 

(iv) The quantum logic. In desperation, some physicists turned philosophers have decided to 
dismiss the very logic we are using - in science and elsewhere, so that a statement like “the Bell 
inequalities are violated” would not make any definite sense. OK, if we dismiss the formal logic, I do 
not know how we can use any scientific theory and make any predictions - until the quantum logic 
experts tell us what to replace the classical logic with. To the best of my knowledge, so far they have not 
done that, at least for the measurement process. I personally trust J. Bell’s opinion: “It is my impression 
that the whole vast subject of Quantum Logic has arisen [. . .] from the misuse of a word.” 

The weakness of all interpretations of quantum mechanics is that, as far as I know, none of 
them has yet provided any suggestion of how this particular interpretation might be tested experimentally 
to exclude the other ones. On the positive side, there is a consensus that quantum mechanics makes correct, 
if sometimes probabilistic, predictions of all reliable experimental results we are aware of. Maybe, this 
is not that bad for a scientific theory. 12 


10 A detailed discussion of the role of statistical ensembles may be found, e.g., in L. Ballentine, Quantum Mechanics, 
World Scientific, 1998. 

11 D. Bohm, Phys. Rev. 85, 165; 180 (1952). 

12 If the reader is not satisfied with this “positivistic” approach, and wants to improve the situation, my earnest 
advice would be to start not from square one, but from reading what other (including some very clever!) people 
thought about it. A good starting point is the review collection by J. Wheeler and W. Zurek (eds.), Quantum 
Theory and Measurement, Princeton U. Press, 1983. 